The Best Of Dates, The Worst Of Dates

Summary: General summary of calendars, dates, and times from past to the present. This includes details on using the ISO 8601 standard in and out of computers. It ends with information for software developers wanting to properly use dates and times.

2003-07-09   02:06 UTC


By Gilbert Healton
ghealton@exit109.com

(Table Of Contents)

This document should appeal to anyone interested in dates, times, or calendars. Readers may be regular people, managers, or people who develop computer software. The document has been written so that anyone may start at the beginning and stop once they have had enough of the subject.

The document starts with a short history of calendars and timekeeping. Next comes a discussion of various standards, especially the "new" ISO 8601 that is gaining in popularity on an international basis. The final sections are for authors of computer programs.

The technical sections start with general use of date and times and then advance to specific examples in different programming languages. While these final sections cover many programming languages, the main effort is for C and UNIX along with derived languages and operating systems, such as Linux, C++, and the Javas.

Just because Y2K is over doesn't mean programmers are done creating date bugs.

This document is under revision and subject to change (2002-12). Indeed, work is (slowly) being done on a major revision of this document.

Some of the links that have expired since this document was first written have yet to be found and corrected by this author. Please visit http://www.exit109.com/~ghealton/.dates.html for the list of my current time and date links (dot dates dot html).

While this document was originally written before the Year 2000 to help software developers cope with it, and much of the tense herein is past tense, the majority of the information still applies to new programs being written today.
The information in this document is on a best effort basis. The author actively solicits additions, corrections, and clarifications. Because the documentation and code herein are free of charge, there is no warranty for the contents. All use of the information herein is at the user's own risk.
Webmasters with links to this site should notify the author to be advised of the update. This is important as the new file will have a new file name. Perhaps even a new domain name. This list of registered references is included in http://www.exit109.com/~ghealton/.home.html (dot home dot html) under References To My Web Pages. Prefer links similar to:

<a href="http://www.exit109.com/~ghealton/y2k/yrexamples.html"><EM>The Best Of Dates, The Worst Of Dates</EM></A>
General summary of calendars, dates, and times from past to the present. This includes details on using the ISO 8601 standard in and out of computers. Ends with information for software developers wanting to properly use dates and times.

Portions of this document copyright 1995-2002 by Gilbert Healton (ghealton@exit109.com)
All rights reserved. Permission is granted to reproduce the document, in whole or in part, for local use providing this copyright notice, this table box, and the title, including author's name, are reproduced in full and downloaders keep their archives current by checking back at least once a quarter or registering with the author to be advised of any significant updates.

Table Of Contents

If your browser is smart enough, selecting any of the following index items should take you to the indicated section. Once in the section selecting the section number should return you to this table of contents.

1. Trademarks
2. Introduction
3. A Brief History Of Dates
3.1 Old Roman Calendar
3.2 Julian Calendar
3.3 Gregorian Calendar and Leap Years
3.4 British Calendar Correction
3.5 Other Calendars
3.6 Time Of Day
3.6.1 New Days
3.6.2 Time Zones
3.6.3 Daylight Saving Time
3.6.4 Leap Seconds and UTC
4. International Standards
4.1 Introduction
4.2 ISO 8601 Standard
4.2.1 Introduction
4.2.2 Date Formats
4.2.3 Additional Details
4.3 RFC-822 Date Standard
4.3.1 Introduction
4.3.2 Date Format
4.3b RFC-2822 Internet Message Formats
4.4 Julian Days versus Ordinal Calendars
4.5 When Did The Third Millennium Begin?
5. Miscellaneous Things
5.1a Date Numerology
5.1b Odd Day
5.2 Julian! Julian! Who Art Thou Julian?
6. Things That Go "00" In The Year
6.1 Often Overlooked Problems
6.2 Aggressive Two-Digit Years
6.3 Other Bad Years
6.4 Date Set backs: The Ugly Y2K Work Around
7. Native Computer Time Formats
7.1 Introduction
7.2 History Of Computer Dates and Time
7.3 Native UNIX Time Formats
7.3.1 time_t values
7.3.2 2038 time_t bug
7.3.3 Bad Casting
7.3.4 time_t And Timezones
7.4 Native C Time Formats
7.5 Beyond UNIX and C
7.6 IBM PC, and compatibles, Time
7.7 Microsoft Times
7.7.1 Microsoft Epochs
7.7.2 Microsoft 100ns Ticks
7.8 Mainframe Time
7.9 Macintosh Time
7.10 OS/2 Time
8. Bad Coding Techniques
8.1 Good Code
8.2 Ways To Abuse Good Formats
8.2.1 Errors In Data Types
8.2.2 Errors In Date Calculations
8.2.3 Errors In Reading Dates
8.2.4 Errors In Printing Years
8.3 Leap Year Bugs
8.3.1 Proper Technique
8.3.2 Bad Tests
8.3.3 Leap Year Legends
8.4 Length Of Day Bugs
8.5 Makefile and Distribution Bugs
8.6 Troff/Nroff Bugs
8.7 Magic Dates
8.8 1999-August-22 GPS Overflow
8.9 Microsoft April-01 DST Bug
8.10 Shell Script Bugs
8.11 JavaScript Bugs
8.11.1 General JavaScript Time Bugs
8.12.2 JavaScript Bonus Bug
8.12 System Bugs
9. Testing Applications By Changing CPU Dates
9.1 Introduction
9.2 Hazards Of Changing The Time
9.3 Before Changing The Time
9.4 After Changing The Time
10. Good Coding Techniques
10.1 Also See
10.2 Data Types
10.2.1 Use Native Types
10.2.2 C Language Types
10.2.3 Perl Types
10.2.4 Troff/Nroff Macros
10.2.4.1 Introduction
10.2.4.2 Four-digit Years
10.2.4.3 Two-digit Years
10.2.4.4 ISO 8601 Dates
10.3 UNIX Data Types
10.4 Leap Years
10.5 Ordinal Dates
10.6 Julian Days
10.7 Integer Days
10.8 Pivot Years
10.8.1 Introduction
10.8.2 Fixed Pivot Years
10.9 Accuracy versus Performance
10.10 Coping With Untrusted Dates
10.11 Interval Timers versus Time Of Day
11. References
12. Credits

1. Trademarks

Many of the designations used by manufacturers and sellers to distinguish their products are claimed as trademarks. Trademarks this author is aware of use the capitalization associated with the trademark. Known trademarks follow. Click on the trademark to get the official Trademark Usage Guide from the trademark owner (if available on the web). Click on the owner name to get the owner's home page (again, if available).


 
Please let the author know of any trademarks used in this document that are not listed here.

2. Introduction

This document provides general information about dates, times, and international standards, for both the general public and computer programmers.

The section for general readers provides a history of calendars, times, and related background information many find interesting. The document starts with a section for general readers and gets more complex as it goes along. Therefore general readers may simply read until they've had enough on the subject. A few pieces of technical gibberish have been sprinkled throughout this section to provide important information to technical readers. Normal people may simply ignore these short bursts of strangeness.

The sections for programmers describe ways programs misuse or properly use dates and times: how to spot and repair bad date code, and how to write good date code. Year 2000 problems are just one point covered herein.

[Y2K] leads information describing techniques frequently associated with Year 2000 problems. Clicking on the [Y2K] should advance you through a circular list of all [Y2K] notes in this document.

Y2K presents a unique challenge to the computer industry and its users. Never before has such an immovable deadline hit so much of the industry. Year 2000 can not be postponed if the software is late. Unlike typical bugs, which strike at random intervals or at those customers trying something new, Y2K can strike across the entire customer base, striking at the same time, at customers that haven't changed anything. Not just in one part of the program, but at numerous locations of many programs. Critical supplies from key vendors are also at risk at the same time, both for Y2K and overloading, which may reduce the resources you have available to you for repairing your problems.

3. A Brief History Of Dates

3.1 Old Roman Calendar

The Roman Calendar, which the current Gregorian Calendar is derived from, was viewed differently by the Romans than it is today. Rather than counting days up in the month, they set two key reference points in each month and counted the days remaining until the reference point. The first of the month was the Kalendae (or Calends in today's desk dictionary, sometimes spelled as Kalends in other sources) while the middle of the month was the Idus (or Ides). The Ides was the 13th day of most months and the 15th day for months with 31 days.

One day before the Ides of March was March-14.
One day before the Calends of April was March-31.

(With the Year 2000 problem, we need to "beware the Calends of January".)

This author has also seen references to the Nonae (Nones), which seems to be the 9th day before the Ides (13th or 15th of the month). Adjust the previous confusion accordingly if you encounter this term.

The old Roman Calendar also considered the Calends of January to start a new year (personal observation: shortly after the winter days start to get longer). However, when it came along, the Christian Church changed this in 567 by order of the Council of Tours, for religious/political purposes. However, exactly what day years should start on was not well specified and varied from country to country and time to time. March was a popular month to start new years (spring time, when the earth came alive again), though there was no agreement on what day in March should start a new year. The Calends and Ides of March were among the more popular selections for March new years.

3.2 Julian Calendar

By 46 BC the Roman calendar was a major mess. Among other things the lack of leap years had gradually lost enough days to make the calendar claim spring, the start of a new year, the Calends of March, actually occurred in what was really late November, a few days before the Calends of December. Not a good time to plant your spring crops.

Julius Caesar, under the advice of the astronomer Sosigenes of Alexandria, issued a decree that lengthened 46 BC to 445 days to bring the calendar year into line with the solar year. This made the longest year on record.

As part of the 46 BC correction, subsequent years had an extra day added six days before the Calends of March every fourth year. Rather than making a separate day number, this month had two "six days before the Calends of March", or "two sixes", which the current word bissextile is derived from. Thus February-24 is historically the leap day, not February-29. In the American and the European community this is often called the "Old Style Date" or "Old Style Calendar".

Ignoring differing leap years, the months used in this calendar match the months in the current Gregorian Calendar.

Unfortunately, the priests of the time greatly botched leap year calculations after this calendar correction and it took a few decades to discover this fact. Claus Tøndering's Calendar FAQ details the aftermath of these errors lasting through 4 AD, which omitted 0004-02-29.

3.3 Gregorian Calendar and Leap Years

Every 128 years the Julian Calendar became off by an additional day. By 1263 Pope Urban IV received a letter from Roger Bacon urging a calendar adjustment to correct the current error and prevent future errors. In 1582 Pope Gregory XIII, working with the Council of Trent, acted on this solution when it was proposed again. The 10 extra days in the year were taken from October to restore the equinox to March-21.

The rules of 100 and 400 in calculating leap years were also added at this time.

Leap years are years evenly divisible by 4, unless the year is evenly divisible by 100. However years evenly divisible by 400 remain leap years.

In the American and the European community this is often called the "New Style Date" or "New Style Calendar".

While the Gregorian Calendar was surprisingly accurate, it was slow to be adopted by some countries, especially Protestant ones. The Protestant Reformation, which started on 1517-October-31, often greatly delayed the adoption of the Gregorian Calendar in Protestant countries for political reasons.

3.4 British Calendar Correction

Effective in 1752 the British Parliament adopted the British Calendar Act of 1751, taking the eleven extra days from September (the British calendar became an additional day off in the year 1700, which British Calendars considered a leap year and Gregorian Calendars did not). The parts of America under the control of the British followed this change. The parts under control of France and Spain changed in 1582. Other places changed at other times. What a mess.

Along with dropping 11 days from September, the legal start of year, March-25, was moved to match the common practice of January-01. This made the stub year 1751 the shortest year on record: it consisted only of March-25 through December-31. It was only 282 days long.

The second shortest year depends on the country you are in. Except for countries where the start of the year changed, such as England, the shortest year was during the Julian to Gregorian calendar change. The year length depended on how many days were dropped from that year. Corrections made during leap years usually resulted in an additional day being added to the year (normally 1752 would have been a leap year, but as this stub year did not have a February, it could not be a leap year). Changes made in 1582 resulted in years of 355 days. Changes in the 1700's made years 354 days long, or 355 days in leap years. Changes made in the 1800's produced 353/354 day years. Changes made in the 1900's or 2000's require years 352/353 days long.

The common practice of using the Pope's calendar in England before 1752 resulted in some confusion as to what year it was for the months of January, February, and part of March. To avoid confusion many colonial records used dates providing both the official and common year. A typical example is "5 Feb 1750/51".

After 11 days were dropped from 1752-September many people adjusted annual events based on previous dates, such as birthdays, to continue observing them a true year apart. This resulted in people writing both days and both years for some dates when referring to the "old style" dates. George Washington's birthday could be written as "11/22 Feb 1731/32".

The Year 2000 problem is not the first point of major confusion over calendars.

3.5 Other Calendars

Countries like Greece didn't adopt the Gregorian reform until years as late as 1923, when they had to drop 13 days. Whitaker's Almanac contains a complete section on the differing calendars in use around the world and throughout the ages. Claus Tøndering's Calendar FAQ makes the dates of the changeover for many countries available on the web.

The Gregorian Calendar is by no means universal. Some places, or religions, still use the old Julian Calendar while others have their own historic calendars that continue to be used to this day. There are many such calendars in use today... well into three digit values.

The concept of months appears to have originally been based on the moon's "monthly cycle" (synodic period) of about 29.53 days. Early calendars based on the synodic month quickly came out of sync with the year, as twelve lunar months total only about 354.367 days.

Today's Gregorian calendar is based only on the period of earth's year around the sun. However, the Islamic calendar still uses the moon's period and not the sun's.

Lunisolar calendars use both the lunar and solar cycles. Every few years a whole month is inserted into the year to bring the calendar back into line with the solar calendar. The Chinese and Hebrew calendars are modern examples of lunisolar calendars.

When astronomers run dates backwards before year 1 BC they simply use straight number signs. Rather than use 1 BC they use year 0. The year 2 BC becomes year -1, etc. This helps keep the computers doing their calculations happy.
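For programmers, the astronomers' numbering described above reduces to a one-line conversion. The following is only a minimal sketch; the function name and the era flag are this example's own inventions, not part of any standard library.

```c
/* Convert a historical year to astronomical year numbering.
 *   year       - positive year number within its era (1, 2, 3, ...)
 *   era_is_bc  - nonzero for BC years, zero for AD years
 * 1 BC becomes 0, 2 BC becomes -1, and AD years are unchanged. */
int astronomical_year(int year, int era_is_bc)
{
    return era_is_bc ? 1 - year : year;
}
```

With this numbering, the difference between two years can be found by plain subtraction, with no special case for the missing year zero.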

3.6 Time Of Day

3.6.1 New Days

When a new day starts is purely a local convention. While the current popular convention starts a new day at midnight, this has not always been the case, and is not the case everywhere even today.

The three other times to start a new day have been seen by this author. Morning, when the sun rises; noon, when the sun is high in the sky; and evening, when dusk starts. These are still in use today in different parts of the world, especially for religious or other special uses.

Noon is the historic start of day for the old Roman Empire. This author suspects that the decision to start a new day at noon had something to do with watching sundial shadows peak and fall at noon along with keeping all astrological observations at night, where the sky was much more interesting, in the same "day". What really is different between current and Roman days is that Romans tracked time with sundials using 12 hours of daylight and 12 hours of night. Therefore the length of a Roman "hour" was longer in summer and shorter in winter. Today's astronomers continue to use the noon-to-noon days when tracking astronomical days.

Rainfall and river flow in the UK are measured from 9am to 9am and ascribed to the day which contains the greater part of the time.

It's been rumored that some transport companies record days using 03:00 to 27:00 to allow customers going to their work day to be scheduled on the same transport day. This author would love to hear from anyone with definitive knowledge on the subject, historical or current.

3.6.2 Time Zones

How we track time also has changed. Long ago noon was when the sun was high in the sky, a purely local event. The local clock keeper of the town would ensure the clock matched the sun each day and everyone took their time from the official local clock. With the advent of faster travel, in particular the railroads, a need for a standard, predictable, time arose. Many long distances only had a single-track railroad and without electronic communications running trains strictly on schedule was vital to avoiding collisions.

In 1884 an international convention divided the world into twenty-four time zones. There is no international standard for the actual names or abbreviations of these zones. Each country is free to define its own names and abbreviations and RFC-822 defines its own set of names. Countries may declare their own compliance to time zones. Indeed some countries are 15, 30, and 45 minutes off of UTC. Historically even stranger offsets from UTC have been used. From 1909-May-01 through 1937-June-30 the Netherlands was exactly 19 minutes and 32.13 seconds ahead of UTC by law (such offsets can not be represented exactly in the ISO 8601 standard). Programs that need to process time zones must allow minute, even second, offsets. The time zones are not straight lines, but snake around to meet the needs of the local people.
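One way to honor the advice about minute and second offsets is to carry every zone offset as a signed count of seconds rather than whole hours. The sketch below assumes that representation; the function name is hypothetical, and the "+hh:mm:ss" output used for odd offsets is a display convenience of this example only, not an ISO 8601 form. Fractions of a second, such as the historic Netherlands offset, would need a finer unit.

```c
#include <stdio.h>
#include <stdlib.h>

/* Format a UTC offset, given in whole seconds, as "+hh:mm".
 * When the offset is not a whole number of minutes the seconds are
 * appended as "+hh:mm:ss" (for display only).
 * Returns what snprintf() returns: the number of characters needed. */
int format_utc_offset(long offset_seconds, char *buf, size_t buflen)
{
    char sign = (offset_seconds < 0) ? '-' : '+';
    long magnitude = labs(offset_seconds);
    int hh = (int)(magnitude / 3600);
    int mm = (int)((magnitude % 3600) / 60);
    int ss = (int)(magnitude % 60);

    if (ss != 0)
        return snprintf(buf, buflen, "%c%02d:%02d:%02d", sign, hh, mm, ss);
    return snprintf(buf, buflen, "%c%02d:%02d", sign, hh, mm);
}
```

An offset of 20700 seconds (5 hours 45 minutes) formats as "+05:45", while 1172 seconds, the whole-second part of the old Netherlands offset, formats as "+00:19:32".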

Time zones center on the "prime meridian", which was originally defined as longitude zero, a line that bisected a critical part of the main telescope at the Royal Observatory at Greenwich in south-east London. All countries defined their time as some offset, positive or negative, away from "Greenwich Mean Time" (GMT). GMT is based on mean local time at an agreed-on zero longitude. This Mean Time at Greenwich England is not subject to daylight-saving rules (local Greenwich time, however, is subject to British daylight-saving time rules). The Americas are negative while Asia and most of Europe tend to be positive. See the National Institute of Standards and Technology Glossary for more information.

As this was basically an American and European convention, the problem of new days was dumped in the middle of the biggest body of water, the Pacific Ocean, twelve time zones away, on the other side of the world. This is the "International Date Line" and tends to be as far from different countries as possible. Crossing the International Date Line adds or subtracts a day from your local time, depending on the direction of travel. While written about by a few people for over a century, the first people this actually happened to, much to their surprise, were survivors of Magellan's crew on their 1522 return to Cape Verde Islands when the day turned out to be Thursday rather than the expected Wednesday.

The U.S. military, and various other international users, have set up a series of single-letter codes to represent different zone offsets.

Y X W V U T S R Q P O N Z A B C D E F G H I K L M

with "Z" ("Zulu") holding the prime meridian, or Zero offset, "Y" being "-12", and "M" being "+12" (note the lack of the letter "J"). These letter codes do not work in countries with offsets not an even multiple of 60 minutes from Zulu (UTC) time.

RFC-822 attempted to implement these letter codes. While the text portion of the RFC described the offsets correctly, the sample numeric values had their signs reversed from the (Mil?) standard RFC-822 was attempting to model. Except for Z, which has an offset of Zero, the success of a program using RFC-822 would depend on what section of the RFC was used to write the code.

ISO 8601 only implements "Z".

3.6.3 Daylight Saving Time

"Saving time" is a concept that allows nations away from the equator to adjust their clocks by an hour to allow for the change of dawn and dusk that occurs with the change of seasons. "Spring forward and fall back" to set clocks forward in spring and backwards in fall, whenever your Spring and Autumn are for your area.

There are no international standards for "saving time". Unless bound by some agreement, each country is free to define its own rules on the subject.

3.6.4 Leap Seconds and UTC

In the 1970s the world standards switched from "GMT" time to Atomic Time, or "UTC" time (Coordinated Universal Time). GMT follows all movements of the earth and is slightly different each day. UTC is atomic and exactly the same each day. As the years go by GMT drifts from UTC. When this difference approaches 0.9 second the Bureau International des Poids et Mesures (International Bureau of Weights and Measures, a.k.a., BIPM) in France declares a day in the future that will have a "leap second" added (or subtracted) to prevent the drift between the two times from reaching a full second. These are effectively random intervals that can not be predicted very far in the future. Leap seconds are typically implemented on the last day of June or December, but may occur at the end of any quarter.

Leap seconds may be positive or negative, adding or subtracting a second on the last minute of a day. On a "leap second" day the day ends at the completion of 23:59:58 for negative leap seconds and after the completion of 23:59:60 for positive leap seconds. The next second is "00:00:00" of the next day, which is also known as 24:00:00.

At the time of this writing the last leap second was 1998-December-31 (1998-12-31) and none is scheduled through the end of 2003-December (2003-12). This makes an unusually long dry period for leap seconds, but that's how the world turns. This may last into, or perhaps beyond, 2006. Perhaps.

Leap Seconds are added (or subtracted) at the same time around the globe at 23:59:59 UTC, whatever the local time may be, regardless of any local time zones.

Date        Time      Location
1998-12-31  23:59:60  UTC
1999-01-01  08:59:60  Melbourne Australia
1998-12-31  17:59:60  Local Central USA time
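Programs that reject a seconds value of 60 will choke on a positive leap second. The sketch below shows one tolerant range check for UTC times of day. The function name is hypothetical, and the check makes two simplifying assumptions: it looks at the time only, so it cannot tell whether the date is one on which a leap second was really declared, and it applies to UTC only, since in local time the leap second lands at whatever hour and minute the zone offset dictates.

```c
/* Return 1 if hh:mm:ss is an acceptable UTC time of day, else 0.
 * Second 60 is accepted only as 23:59:60, where a positive leap
 * second may appear.  24:00:00 is accepted as the end of the day. */
int is_valid_utc_time(int hh, int mm, int ss)
{
    if (hh == 24)
        return mm == 0 && ss == 0;          /* 24:00:00 only */
    if (hh < 0 || hh > 23 || mm < 0 || mm > 59 || ss < 0)
        return 0;
    if (ss <= 59)
        return 1;                           /* ordinary second */
    return ss == 60 && hh == 23 && mm == 59;    /* possible leap second */
}
```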

Currently (2004-03) the ITU SRG 7A group that deliberates the future of UTC is considering replacing UTC and its leap seconds in the year 2022 with a new "Temps International" (TI) time when UTC is 50 years old. TI time is not linked with the earth's rotation and would require changing local time zone offsets every few centuries to keep within an hour of daylight times (leap hours?). Time zones maintaining daylight savings time could eliminate the extra hour at the end of savings time for that year's correction.

4. International Standards

4.1 Introduction

Several date and time standards are used on an international basis. A short summary of several standards used internationally by the computing community follows. To aid technical people, sample printf(), and perhaps other code, is provided. Normal people are expected to ignore this code!

4.2 ISO 8601 Standard

4.2.1 Introduction

The International Organization for Standardization has defined a standard named "Data elements and interchange formats -- Information interchange -- Representation of dates and times", which is used in a growing number of places throughout the world for writing dates and times. Numbered as standard 8601 in the ISO numbering sequence in use during 1988, it is commonly referred to as the "ISO 8601 standard". This standard extends, improves, and unifies earlier standards that date back to at least 1971 (e.g., ISO 2014, ISO 2015, ISO 2711, ISO 3307, and ISO 4031). ISO 8601 has been revised a number of times since its introduction.

In 1992 the European Committee For Standardization adopted ISO 8601 under the standard EN 28601 to end the traditional confusion involving periods, slashes, the order of the numbers, and other date formats formerly used throughout Europe.

Companies that wish to do business on a global scale are best served by avoiding date formats local to their own country. Distributing local date formats on a global basis tends to cause confusion to people in other countries. For web pages being read by world wide audiences, using ISO 8601 format dates and times seems the only sane way to present dates and times. One Internet poll decidedly favors using ISO 8601 format for date and times on web pages.

A growing number of individuals and companies are changing from writing dates in their traditional local format to using ISO 8601 compliant dates and times (e.g., yyyy-mm-dd hh:mm or yyyy-ooo hh:mm) in everyday documents. This is especially true for documents used in international trade. ISO 8601 week numbers are also increasingly being used to specify specific weeks.

The growing availability and popularity of low cost digital watches and the increasing use of computers is making the hh:mm and hh:mm:ss formats adopted by ISO 8601 for time well known throughout the world.

The United Nations Economic Commission for Europe, Working Party on Facilitation of International Trade Procedures, Recommendation 07, The Numerical Representation of Dates, Time and Periods of Time, is based on the ISO 8601 standard. Visit UN/ECE Trade Facilitation Recommendation 07 at http://www.unece.org/cefact/rec/rec07en.htm for more information.

NOTE: While the following interpretation of ISO 8601 should be sufficient for the vast majority of users, it is still only a summary. The actual standard is much longer, repeats many details, and should be consulted for full details. Major corporations or people developing applications particularly sensitive to date time standards should purchase an official copy of the standard rather than rely on comments or free draft copies.

Those who want more, but without the standard, may visit RFC-3339, which defines a date and time format for use in Internet protocols that is a profile of the ISO 8601 standard for representation of dates and times using the Gregorian calendar.

NOTE: Because ISO 8601 is so vast and complex many companies are issuing their own standard that follows only a selected subset of the full standard. In practice this tends to be the yyyy-mm-dd format with an hh:mm or hh:mm:ss format for time. When more accuracy is needed, the hh:mm:ss,nnn format (note comma) allows whatever precision is appropriate to be used. Trailing zeros are always added to have all times use the same number of significant digits.

4.2.2 Date Formats

The ISO 8601 format is large and complex and sets forth many ways of representing dates and times. The three major forms of date and times defined by ISO 8601 follow. In practice only the first two formats enjoy general use.

yyyy-mm-ddTHH:MM:SS
yyyy-dddTHH:MM:SS
yyyy-Www-dTHH:MM:SS

Note that hyphens separate date fields and colons separate time fields. Dates using these characters are said to be in "extended format". The fields are:

yyyy-mm-dd
Year (usually four digits, but see "truncation", in section 4.2.3), month, and day. When present, month and day are written using two-digit numbers, with leading zeros as appropriate. Month is 01 through 12 and day is 01 through 31, as appropriate.
printf( "%04d-%02d-%02d", d.tm_year+1900, d.tm_mon+1, d.tm_mday );

NOTE: As a partial transition to ISO 8601, a number of web sites use the month name or abbreviation in a date when changing their dates to use year, month, day formats. The month names January through December and the weekdays Monday through Sunday are acknowledged in 4.3.2.1 of ISO 8601:2000 for reference, but not for official use. Abbreviations of these names are not mentioned in the standard.

NOTE: the standard requires that years on or before 1582 be avoided except by mutual agreement. For practical purposes the year 1752 may provide a safer limit. The standard assumes the Gregorian calendar is run backwards as if it was always in use (a "proleptic" calendar). This is unlikely to be acceptable to many uses involving historic dates. Historians discussing early dates tend to use the Julian calendar, and for dates before 45 BC, a proleptic Julian calendar.

Thus proleptic years need to be considered an agreed upon, but imaginary, calendar when referring to dates before the Gregorian calendar correction became effective. Such dates need to be converted to the appropriate local date.

yyyy-ddd
Year and Ordinal Day number (001-365, 366 on leap years: always three digits).
printf( "%04d-%03d", d.tm_year+1900, d.tm_yday+1 );

yyyy-Www-d
Year and Week Number (01 to 52 or 53). The week number may be followed by an optional "day of week" number with 1 being Monday and 7 being Sunday.
printf( "%04d-W%02d-%d", d.tm_year+1900, Week, WeekDay );
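The Week and WeekDay values in the printf() above have to be calculated by the program. Under the ISO 8601 rule that week 01 is the week containing the year's first Thursday, the week a date belongs to is decided by the Thursday of that week, so the first or last days of a calendar year may fall in a week numbered in the adjacent year. The following is a sketch only; the function names are this example's own, and it relies on the tm_year, tm_yday, and tm_wday members of struct tm already being filled in (as mktime() or gmtime() would do).

```c
#include <time.h>

/* Number of days in a Gregorian year. */
static int days_in_year(int y)
{
    return ((y % 4 == 0 && y % 100 != 0) || y % 400 == 0) ? 366 : 365;
}

/* Compute the ISO 8601 week-date fields for the date held in *d.
 * Only tm_year, tm_yday, and tm_wday are read.
 *   *iso_year - year the week is numbered in (may differ from tm_year+1900)
 *   *week     - week number, 1 through 52 or 53
 *   *weekday  - 1 for Monday through 7 for Sunday */
void iso_week_date(const struct tm *d, int *iso_year, int *week, int *weekday)
{
    int year = d->tm_year + 1900;
    int wd   = (d->tm_wday == 0) ? 7 : d->tm_wday;   /* Sunday 0 becomes 7 */
    int thu  = d->tm_yday + 1 - wd + 4;  /* ordinal day of this week's Thursday */

    if (thu < 1) {                        /* Thursday is in the previous year */
        year--;
        thu += days_in_year(year);
    } else if (thu > days_in_year(year)) {   /* Thursday is in the next year */
        thu -= days_in_year(year);
        year++;
    }

    *iso_year = year;
    *week     = (thu - 1) / 7 + 1;
    *weekday  = wd;
}
```

Note that the year printed with a week number should be the year returned here rather than tm_year+1900: 2005-01-01, a Saturday, is 2004-W53-6.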

T
The literal character T is used to separate the date from the time of day when combining dates and time. While an upper case "T" is the letter of choice a lower case "t" may be used if the upper case "T" is not available in the character set. See "Truncation and Reduction", in section 4.2.3, for details of omitting the T from dates and times.

If only a time field is present, a T may be placed before it to identify it as a time: T010203

While replacing the "T" with a space makes the date and time much easier for human eyes, and appears to be a very common practice, technically it is a violation of strict ISO 8601 notation (see ISO 8601:2000 4.4). This is true for spaces embedded anywhere within a date or time.

However, while the original scope for ISO 8601 was for computer to computer "Information Interchange", ISO 8601 is now being used in many areas the original standards committee never considered when the standard was designed. Thus an out of specification space character is acceptable for displaying dates for people, as computer to human transfers are not strictly within ISO 8601's scope. Even on computer transfers this date can be made in scope if the date and time are considered two separate, but related, values.

Suggestions from this author for character format dates and times: Use the "T" when the prime audience for dates and times are other software packages and space for dates and times read by humans. When using spaces, ensure you word the specification to state/imply that separate date and time fields are being used rather than a single date and time field to avoid being accused of being an idiot by ISO 8601 purists.
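A program reading dates written either way can accept both separators. The sketch below checks a fixed-width yyyy-mm-ddTHH:MM:SS string character by character and takes "T", "t", or a single space between the date and the time. The function names are hypothetical, accepting the space is a deliberate leniency of this example rather than strict ISO 8601, and the day is only range-checked against 01 through 31, not against the length of the particular month.

```c
#include <ctype.h>
#include <string.h>
#include <time.h>

/* Value of the two decimal digits at p (caller has verified they are digits). */
static int two_digits(const char *p)
{
    return (p[0] - '0') * 10 + (p[1] - '0');
}

/* Parse "yyyy-mm-ddTHH:MM:SS" into *out.  Returns 1 on success, 0 on failure.
 * The separator may be 'T', 't', or a single space. */
int parse_iso8601(const char *s, struct tm *out)
{
    static const char shape[] = "dddd-dd-ddSdd:dd:dd"; /* d = digit, S = separator */
    size_t i;
    int year, mon, day, hour, min, sec;

    if (strlen(s) != sizeof shape - 1)
        return 0;
    for (i = 0; shape[i] != '\0'; i++) {
        if (shape[i] == 'd') {
            if (!isdigit((unsigned char)s[i]))
                return 0;
        } else if (shape[i] == 'S') {
            if (s[i] != 'T' && s[i] != 't' && s[i] != ' ')
                return 0;
        } else if (s[i] != shape[i]) {
            return 0;                     /* expected '-' or ':' */
        }
    }

    year = two_digits(s) * 100 + two_digits(s + 2);
    mon  = two_digits(s + 5);
    day  = two_digits(s + 8);
    hour = two_digits(s + 11);
    min  = two_digits(s + 14);
    sec  = two_digits(s + 17);
    if (mon < 1 || mon > 12 || day < 1 || day > 31 ||
        hour > 23 || min > 59 || sec > 60)      /* 60 allows a leap second */
        return 0;

    memset(out, 0, sizeof *out);
    out->tm_year  = year - 1900;
    out->tm_mon   = mon - 1;
    out->tm_mday  = day;
    out->tm_hour  = hour;
    out->tm_min   = min;
    out->tm_sec   = sec;
    out->tm_isdst = -1;                   /* daylight saving status unknown */
    return 1;
}
```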

HH:MM:SS
time stamp, in a 24-hour clock. Typically each portion of the time is written using two-digit numbers, using leading zeros as needed.
printf( "%02d:%02d:%02d", d.tm_hour, d.tm_min, d.tm_sec );

Fractional time can be represented using decimal notation after the HH, MM, or SS, typically after the SS. The character ISO 8601 prefers for the decimal sign is a comma (,). The only other choice is a period (".", i.e., a "full stop"). Two digits will always be found to the left of the decimal sign. Once a fraction is started, subsequent MM or SS fields, if any, must be omitted (e.g., 15,5:30 is invalid).
printf( "%02d:%02d:%02d,%03d", d.tm_hour, d.tm_min, d.tm_sec, (int)ftime_time.millitm );
printf( "%02d:%02d:%06.3f", d.tm_hour, d.tm_min, FloatSecond );

(NOTE: do not use %g notation!)

4.2.3 Additional Details

This is where the details most casual users of ISO 8601 will need are found. Anyone wanting to do more with ISO 8601, or wishing it did more, should look through this section for the variations and the smaller details of ISO 8601. There has been an attempt to order this information from the most practical at the start to the more obscure at the bottom.

More Information: There is quite a lot more to ISO 8601, some of which is covered in this section. See the standard if you need serious or legal details. This author maintains a list of locations where copies of ISO 8601 may be accessed in ISO 8601 Commentary Links. The list is kept outside this document due to its constantly changing nature.

Leap years are years evenly divisible by 4, unless the year is evenly divisible by 100. However, years evenly divisible by 400 remain leap years. ISO 8601 considers the proleptic Gregorian year "0000" a leap year.

See "Gregorian Calendar and Leap Years", 3.3, for the history of Leap Years.
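
The leap year rule above can be sketched in C as follows. This is only a minimal illustration; the function name is_leap_year is this example's own invention and is not part of ISO 8601 or any library.

```c
#include <stdbool.h>

/* Proleptic Gregorian leap year test, following the rule in the text:
 * divisible by 4, except century years, unless divisible by 400.
 * Year 0 ("0000") is divisible by 400, so it is reported as a leap year. */
bool is_leap_year(int year)
{
    if (year % 400 == 0)
        return true;
    if (year % 100 == 0)
        return false;
    return year % 4 == 0;
}
```

Testing the 400 case first keeps the exceptions in the same order a reader would state them: 2000 is a leap year, 1900 is not, 1996 is.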

Leap seconds add a second to, or subtract a second from, a day. Positive leap seconds added to a day are denoted 23:59:60. Midnight may be written as either 00:00, the start of a day, or 24:00, the end of a day; 24:00 of one day is the same instant as 00:00 of the next day. This explicit ISO definition of 24:00 being 00:00 of the next day is not well known.

If 24:00 must be used, only use 24:00 as an ending time. Events using 24:00 should not actually be in, billed to, cross over, or otherwise use, midnight. Rather, treat 24:00 as an instant after the completion of the current day (instant after 23:59:59.9999999...).

Basic versus Extended Formats: The "basic" format provides the minimum characters needed for the desired precision (e.g., 19991231T235900). The "extended" format adds additional separator characters, by specific rules, to make fields easier to read (e.g., 1999-12-31T23:59:00).

Expanded Formats: By mutual agreement of all parties the number of digits in a year may be expanded to record years with values greater than 9999. Further, a plus (+) or minus sign (-) must precede the year's value to indicate if the year is A.D., which uses plus, or B.C., which uses minus. The use of a sign appears to be required, even if only A.D. (positive) years are being used.

This expanded year format may be used in any year field. The number of digits in a year must be part of the mutual agreement.

Great caution needs to be observed when using expanded formats as ISO 8601 requires the Gregorian calendar to be used for all specifications. The break between Julian and Gregorian is not acknowledged by ISO 8601. This makes early dates, especially those before the local calendar conversion, suspect. It is common practice by historians to use Julian calendars before 1582. See the notes in Date Formats, 4.2.2, about "proleptic" calendars for more details about this problem.

Week Numbers: ISO 8601 also allows for week numbers to be tracked using a yyyy-Www notation rather than using month and day numbers. Week "01" is defined as being the first Monday through Sunday week that has a Thursday of the new year in it (author's hint: weeks containing January-04 are always week 01).

Day numbers in a week (ISO ordinal day numbers) start at 1 for Monday and run through 7 for Sunday. There are always three days before Thursday and three days after Thursday in ISO weeks.
printf( "%04d-W%02d", d.tm_year+1900, WeekNumber );
printf( "%04d-W%02d-%1d", d.tm_year+1900, WeekNumber, DayInWeek );
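
The printf samples above assume the week number is already known. One way it might be derived from a filled-in struct tm is sketched below. The function names are hypothetical, and the sketch assumes tm_year, tm_yday, and tm_wday have been set (for example by gmtime() or localtime()). Note that the ISO year can differ from the calendar year near January 1, so the year is returned as well.

```c
#include <time.h>

/* Weekday (0 = Sunday) of December 31 of the given Gregorian year. */
static int dec31_weekday(int y)
{
    return (y + y / 4 - y / 100 + y / 400) % 7;
}

/* An ISO year has 53 weeks when it starts or ends on a Thursday. */
static int weeks_in_iso_year(int y)
{
    return (dec31_weekday(y) == 4 || dec31_weekday(y - 1) == 3) ? 53 : 52;
}

/* Convert a broken-down time into ISO year, week (1..53), weekday (1..7). */
void iso_week_date(const struct tm *d, int *iso_year, int *week, int *weekday)
{
    int year = d->tm_year + 1900;
    int wd = (d->tm_wday + 6) % 7 + 1;        /* Monday = 1 ... Sunday = 7 */
    int wk = (d->tm_yday + 1 - wd + 10) / 7;  /* ordinal day is tm_yday + 1 */

    if (wk < 1) {                             /* belongs to last week of prior year */
        year -= 1;
        wk = weeks_in_iso_year(year);
    } else if (wk > weeks_in_iso_year(year)) { /* belongs to week 01 of next year */
        year += 1;
        wk = 1;
    }
    *iso_year = year;
    *week = wk;
    *weekday = wd;
}
```

The three results can be passed to the yyyy-Www-d printf format shown earlier. Where a C99 library is available, strftime() with the %G, %V, and %u conversions is another route to the same values.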

Notes:

In the following tables the left table shows every possible ending week for an ISO 8601 year. Each row in this table represents one of the possible years. The corresponding row of the right-hand table shows the first week for the following ISO 8601 year.

Last week of prior year (December)        First week of following year (January)
M   Tu  W   Th  F   St  Su                M   Tu  W   Th  F   St  Su
22  23  24  25  26  27  28      ==>       29  30  31   1   2   3   4
23  24  25  26  27  28  29      ==>       30  31   1   2   3   4   5
24  25  26  27  28  29  30      ==>       31   1   2   3   4   5   6
25  26  27  28  29  30  31      ==>        1   2   3   4   5   6   7
26  27  28  29  30  31   1      ==>        2   3   4   5   6   7   8
27  28  29  30  31   1   2      ==>        3   4   5   6   7   8   9
28  29  30  31   1   2   3      ==>        4   5   6   7   8   9  10

Examples: the date 2001-Dec-31 (a Monday) is considered to be a part of the first week of 2002 in the ISO week calendar. This is written as 2002-W01-1 (that is "Year 2002, Week 01, Day 1"), even though in the Gregorian calendar it is actually the last day of 2001. The next day, 2002-Jan-01 (Tuesday), is therefore 2002-W01-2 ('Year 2002, Week 01, Day 2').

Time Intervals: ISO 8601 also allows intervals between two points in time to be specified. These intervals may be associated with actual time points or they may be more abstract without being associated with specific time points. Intervals may include any valid combination of dates and/or times. Recurring intervals may also be specified.
 

  1. Intervals not associated with an actual time point are represented using a series of numbers followed by designators specifying the appropriate period. Valid designators are years (Y), months (M), days (D), hours (H), minutes (M), seconds (S), and weeks (W).
     
     
  2. Starting and stopping date/times are separated from each other by a slash character ("/", a.k.a., solidus).
    1999-12-31T23:59:59/2000-02-29T24:00:00
  3. Styles 1 and 2 may be combined to assign a specific starting or ending time to the period.
    1999-12-31T23:59:59/P0Y1M29DT1S
    P0Y1M29DT1S/2000-02-29T24:00:00
  4. The alternate format of Pyyyy-mm-ddThh:mm:ss may be used by mutual agreement by both parties to specify periods of time. Note the P prefix. A period of 4 years, 5 months, 6 days, and 12 hours follows:
    P0004-05-06T12
    P0004-156T12
    P0004-W22-2T12
    Full truncation and reduction seem to be allowed with this alternative format.
     
  5. Recurring periods of time may be specified by placing an R9/ style prefix, which gives the recurrence count (9 in this example), before one of the period formats described above.
    R8/P1Y
    R6/P15W
    R0/P12D
    R1/PT12H

    The ISO 8601 standard also allows R to be specified with intervals providing a specific time point:

    R4/1999-12-31/2000-12-31

    This format identifies the duration of the first period (1999-12-31/2000-12-31) along with the number of times it is to be repeated (four times: R4). The above sample is the same as R4/1999-12-31/P1Y.
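
As a rough illustration of the designator style of period described in item 1, the following sketch builds a PnYnMnDTnHnMnS string. The struct and function names are hypothetical. Zero-valued fields are simply left out, which is one of several possible conventions, and negative values and the week (W) designator are not handled.

```c
#include <stdio.h>

/* Hypothetical container for the parts of a period. */
struct iso_period {
    int years, months, days, hours, minutes, seconds;
};

/* Write the period into buf; returns the length snprintf reports. */
int format_iso_period(char *buf, size_t size, const struct iso_period *p)
{
    char date_part[48] = "";   /* Y, M, D designators */
    char time_part[48] = "";   /* H, M, S designators, placed after the T */
    int n = 0;

    if (p->years)   n += snprintf(date_part + n, sizeof date_part - n, "%dY", p->years);
    if (p->months)  n += snprintf(date_part + n, sizeof date_part - n, "%dM", p->months);
    if (p->days)    n += snprintf(date_part + n, sizeof date_part - n, "%dD", p->days);

    n = 0;
    if (p->hours)   n += snprintf(time_part + n, sizeof time_part - n, "%dH", p->hours);
    if (p->minutes) n += snprintf(time_part + n, sizeof time_part - n, "%dM", p->minutes);
    if (p->seconds) n += snprintf(time_part + n, sizeof time_part - n, "%dS", p->seconds);

    if (date_part[0] == '\0' && time_part[0] == '\0')
        return snprintf(buf, size, "PT0S");   /* an empty period still needs one field */

    /* The T is only written when a time component follows it. */
    return snprintf(buf, size, "P%s%s%s",
                    date_part, time_part[0] ? "T" : "", time_part);
}
```

A period of one month, 29 days, and one second comes out as P1M29DT1S, the same period as the P0Y1M29DT1S sample above without the zero year.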

Time zones: Time zones may be specified after times by placing a time zone indicator after the time. A zone indicator of Z represents zero offset from UTC (also called "Zulu" time some places outside of ISO 8601). A notation of ±hhmm (basic format; ±hh:mm in extended format), where "±" is a sign ("+" or "-") and hhmm an interval, provides the hours, and optional minutes, offset from UTC. See section 3.6.2, Time Zones for details on time zones.

Tip for times intended for internal use by software or for international distribution to people: rather than encode time zones into a time, simply convert the time to UTC and provide a zone name of Z. A zone of UTC may be used for documents read by people as the average person will not understand Z. Let the readers convert UTC to their local time. Astronomers and UNIX systems have been using UTC for years. For software developers the zoneinfo files provide a standard way of manipulating time zones. In international distribution time zone names can be near useless and numeric offsets more confusing than a simple UTC time. Note that this document uses such a revision time in its heading area.
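
A minimal sketch of the tip above, using only standard C library calls. The function name is this example's own. It converts a time_t to UTC with gmtime() and writes it in ISO 8601 extended format with the Z indicator.

```c
#include <time.h>

/* Format t as yyyy-mm-ddThh:mm:ssZ (UTC).
 * Returns the number of characters written, or 0 on failure. */
size_t format_utc_iso8601(time_t t, char *buf, size_t size)
{
    struct tm *utc = gmtime(&t);   /* broken-down time in UTC, not local time */

    if (utc == NULL)
        return 0;
    return strftime(buf, size, "%Y-%m-%dT%H:%M:%SZ", utc);
}
```

On systems using the usual UNIX epoch a time_t of zero produces 1970-01-01T00:00:00Z. gmtime() returns a pointer to static storage, so threaded programs may prefer a reentrant variant where their platform offers one.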

ISO standard 9945-1 has established a series of standard names for time zones. While this is an ISO standard, it is not used by ISO 8601 in any way. The ISO 9945-1 names use longer words rather than three-character abbreviations, and would not be understood as zones by average readers. This standard seems to specify a standard way of specifying time locales to operating systems rather than time zones to people. The zoneinfo files appear to use them. Information on ISO 9945-1 time zone names may be found by visiting:

http://www.bsdi.com/date/

Truncation and Reduction: ISO 8601 allows you to reduce the precision used to record dates and times. The general rule for ISO 8601 is to properly replace leading fields with hyphens if their values are not desired. While "truncation" must be strictly done in a left to right manner, ISO 8601 rules for these extra hyphens have a few quirks in them. See the following table.

Low order fields may simply be dropped from right to left (reduced) without using any special characters.

Examples of truncations and reductions follow. Rows in italic type note conditions of special interest. People in a hurry should read these even if they skip the others. Truncation requires mutual consent by all involved parties. A format that ensures that all truncated values can be uniquely recovered is essential. Where there is no risk of confusion users may also agree to omit unwanted leading hyphens (again, see the table for quirks). Truncation of century digits of years have been a fertile source of confusion. Caveat Truncator.

Date / Time     Examples                      Truncated               Reduction
                Extended      Basic
1999-12-31      1999-12-31    19991231
                99-12-31      991231          Century[1]
                --12-31       --1231          Full year
                n/a           --12            Full year               Day
                n/a           ---31           Full year and month
                n/a           1999-12                                 Day[2]
                -99-12        -9912           Century[1]              Day
                n/a           -99             Century                 Month and day
                n/a           1999                                    Month and day
                n/a           19                                      All but century
                1999-365      1999365
                99-365        99365           Century[1]
                n/a           -365            Century and year[3]
                1999-W52-5    1999W525
                99-W52        99W52           Century                 Day in week
                -W52-5        W525            Full year
                -W52          W52             Full year               Day in week
                n/a           W-5             All but day in week
23:59:00        23:59:00      235900
                -59:00        -5900           Hour
                n/a           --00            Hour and minute
                23:59         2359                                    Seconds
                n/a           -59             Hour                    Second
                n/a           23                                      Minutes and seconds
23:59:00,153    n/a           --00,153[4]     Hour and minute

  1. Section 4.6 of ISO 8601:2000 covers "truncation". Section 5.2.1.3 of ISO 8601:2000 expands on truncated representations. By agreement the leading hyphen may be omitted when truncating only the century providing the day number is not also being dropped by reduced precision. Once any other portion of the date is omitted by reduction, the leading hyphens to indicate omitted fields, especially omitted century digits, seem to be required.

    The sections on truncation have wording that has generated considerable confusion and discussion. Archives of some discussions can be found at:

    http://groups.yahoo.com/group/ISO8601/message/189
    http://groups.yahoo.com/group/ISO8601/message/197

    Earlier versions of the standard had points covering truncation of century digits that this author found particularly confusing. Readers of earlier versions of this document based on the earlier ISO standard may observe major changes to century truncation rules.

    When negative or expanded years are possible, this author will never want to truncate a year.

  2. When only the day is omitted, a separator must be placed between the year and month. This is the only place basic format uses a hyphen separator, because the six digits, "121212", are to be read as 12-12-12 (YYMMDD), not 1212-12 (YYYYMM).

  3. While --365 could be used (double hyphens), the first hyphen is redundant and therefore omitted.

  4. The comma or full stop (period) between the whole and fractional digits is not a separator character as hyphens and colons are. Rather the whole and fractional portion of the number make a single field.

    In this sample the agreed precision has added 1/1000'th of a second to the time. Trailing zeros are required to keep the fraction the same number of digits in each recorded time.

The T between the date and time may be omitted in most cases. If a T is present it must be followed by a time. If only a time is given, the T may precede it to better indicate it is a time. The T must be present in truncated dates with reduced times.

4.3 RFC-822 Date Standard

4.3.1 Introduction

Another date and time standard frequently used within the computing community, especially the Internet community, is included in RFC-822, which provides standards for date and time used on Internet type networks. RFC-822 dates are in the format "Tue, 16 Feb 99 17:56:23 EST".

RFC-822 is no longer used for new development as it has been replaced by RFC-2822, though when many people casually refer to "RFC-822" they are really referring to RFC-2822. This author believes some reasons RFC-822 fell out of favor include its inherent Year 2000 problems, "English Language" flavor, time zone problems, and the introduction of ISO 8601. A short summary follows. If you find something called RFC-822 that doesn't look quite like this, check out the RFC-2822 format.

4.3.2 Date Format

[ Day, ] dd Mmm yy HH:MM[:SS] zone

These fields are:

Day
Optional day of week (e.g., Sun, Mon, Tue, Wed, Thu, Fri, Sat), followed by a comma (tip to computer programmers: consider the comma optional when reading dates, but always generate them if the day of week is present). Always in English.

dd Mmm yy
Date. E.g., 07 Nov 99.
This standard always uses a two-digit year. In practice this should be interpreted as the closest year to the current date. As RFC-822 is concerned with the transmission of information over networks there is no need to track periods over 99 years in length.
printf( "%02d %3s %02d", d.tm_mday, MonthName[d.tm_mon], d.tm_year%100 );

HH:MM:SS
time stamp, in a 24-hour clock. Seconds is optional.
printf( "%02d:%02d:%02d", d.tm_hour, d.tm_min, d.tm_sec );
printf( "%02d:%02d", d.tm_hour, d.tm_min );

zone
Time zone information. Provides offset from UTC to local time in the standard [+-]hhmm format or one of several standard names.

IMPORTANT NOTE: while RFC-822 defines one-letter time zone codes as described in section 3.6.2, the signs of the numeric offsets were reversed. Thus programs should avoid the one-letter codes with RFC-822 time zone offsets as the receiving program may process it exactly the opposite way you expect. Z (zero) is the only reliable code.

Note that years are always two-digits and must be century corrected when read.

4.3b RFC-2822 Internet Message Format Date Standard

4.3b.1 Introduction

RFC-2822 specifies a syntax for text messages that are sent between computer users, within the framework of "electronic mail" messages. Section 3.3 of the standard specifies the formats of date and times used internally within E-mail messages.

RFC-2822 is a big improvement over RFC-822, but still not as good as ISO 8601.

4.3b.2 Date Format

[ day_of_week, ] Day Month Year HH:MM[:SS] Zone

These fields are:

day_of_week
Weekday name (Mon / Tue / Wed / Thu / Fri / Sat / Sun). If present, this name must be followed by a comma. Software developers should see the note in RFC-822 about this comma.

Day
One or two digit day.

Month
Month name (Jan / Feb / Mar / Apr / May / Jun / Jul / Aug / Sep / Oct / Nov / Dec).

Year
Year: four digit year.

HH:MM:SS
Time of day. Each field being two digits in length. The :SS field providing seconds is optional.

Zone
Time zone offset from UTC. In the form of a sign character (+/-) followed by a four digit time providing hours and minutes of the offset. A colon is not used in this field.

A typical RFC-2822 value follows:

Wed, 18 Jul 2001 11:54:46 -0400
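
A value in this layout can be produced with strftime(), as in the following sketch. The function name is hypothetical. The example always reports UTC with a literal +0000 zone, and it assumes the program is running in the default "C" locale so that %a and %b yield the English names the RFC requires.

```c
#include <time.h>

/* Format t in the RFC-2822 layout, expressed in UTC (+0000).
 * Returns the number of characters written, or 0 on failure. */
size_t format_rfc2822_utc(time_t t, char *buf, size_t size)
{
    struct tm *utc = gmtime(&t);

    if (utc == NULL)
        return 0;
    /* %a and %b depend on the locale; the default "C" locale gives English. */
    return strftime(buf, size, "%a, %d %b %Y %H:%M:%S +0000", utc);
}
```

A program that has called setlocale() for a non-English locale would need to supply its own English day and month name tables instead of relying on %a and %b.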

NOTE: the timestamp placed on the initial "From " line of mail box files written by the sendmail program is in the format "Mon Mar 15 10:25:32 1999", which does not follow RFC-822 or RFC-2822. It is generated by calls to the asctime() function. As this date is not sent over the network it does not need to comply with RFC-[2]822.

4.3b.3 Obsolete Format

RFC-2822 also acknowledges an "obsolete" format that should not be generated by modern programs, though they may need to parse it if received.

While the field order is the same the following fields have different values:

year
Two digit value. Values of 00 through 49 represent 2000 through 2049 and values of 50 through 99 represent 1950 through 1999. Values of 100 and beyond represent year 2000 and beyond. Please note that this date window of 49/50 is different from the standard UNIX date window of 68/69.
zone
A value of GMT or UT for UTC, otherwise a series of values representing different time zones. EST / EDT / CST / CDT / MST / MDT / PST / PDT / Z

Like RFC-822, all single letter time zones, except Z, are untrustworthy.

Some E-mail messages may use other time zone formats.
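
The 49/50 date window described for the obsolete year field might be applied when parsing as in this sketch. The function name is this example's own, and the treatment of three-digit values (adding 1900, so that 100 becomes 2000) follows the description above.

```c
/* Expand a year as read from an obsolete-format date field.
 * 00..49 -> 2000..2049, 50..99 -> 1950..1999,
 * three-digit values such as 100 -> 2000; four-digit years pass through. */
int expand_obsolete_year(int year)
{
    if (year >= 0 && year <= 49)
        return year + 2000;
    if (year >= 50 && year <= 999)
        return year + 1900;
    return year;
}
```

The caller needs to know how many digits were actually read only if it wants to reject malformed input; the numeric value alone is enough to pick the window.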

4.4 Julian Days versus Ordinal Calendars

The concept of Julian Days tracks the number of days that have elapsed since noon of 4713-January-01 BC (Julian Day zero). Julian day one started at noon of January-02. Starting days at noon is a tradition the Julian Day system continues to this day. There is no concept of "years" in this calendar.

This dating system is not to be confused with the "Ordinal Date" calendar, often, but incorrectly, also called the Julian Calendar. This calendar tracks years like the Gregorian calendar, but uses the day number within the year: day 1 to 365 (366 in leap years), for the day. There is no concept of months.

Julian Day one was determined by combining three commonly used cycles of 28 (solar cycle), 19 (Golden Numbers) and 15 (indiction cycle) years to obtain a larger cycle of 7980 years, which is the least common multiple of the cycles. Traditionally credited to Justus Scaliger (1540-08-05, France) it was also discovered earlier by others. Running the Julian Calendar backwards from the current date to the start of the current "great cycle" avoided the need for negative numbers for years along with providing a uniform way of measuring dates and times during a period of accumulating calendar errors.

Please visit http://hermetic.magnet.ch/cal_stud/jdn.htm for more information about Julian Days.

Today Julian Days are used in various scientific or other extended calculations. As long as you are willing to ignore the 1582 (1752 in Britain and its colonies) calendar correction, staying within recent years, it is surprisingly easy to transform Julian days to "Gregorian" calendar dates.

Chronological Julian Days (CJDs) shift the start of a day backwards 12 hours to match our current concept for the start of each day and follow local savings time conventions. Thus the CJD day zero started at Julian 4713-January-01 00:00:00 BC. When most people refer to "Julian Days" they are referring to the chronological kind. The more casual the reference the more likely it is the author did not even know of noon-to-noon days. Ignoring savings time and calendar corrections:

ChronologicalJulianDay = JulianDay + 0.5
JulianDay = ChronologicalJulianDay - 0.5

A Modified Julian Day shifts midnight like Chronological Julian Days do, but also adjusts day numbers to start at 1858-November-17 to obtain a date no more than five digits in length. This is accomplished by subtracting 2,400,000.5 from the JD, which makes the day beginning at midnight of 1858-November-17 MJD zero. The resulting day numbers are more manageable to work with when using large numbers of dates. MJDs strictly follow UTC and are not subject to savings time or calendar correction events. MJDs are widely used in the scientific world for logging events.
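
The conversions in this section can be sketched with integer arithmetic alone. The functions below are an illustration under stated assumptions: the names are this example's own, the Gregorian calendar is applied proleptically (ignoring the 1582/1752 corrections mentioned above), and MJD is taken literally as JD - 2,400,000.5, so that 1858-November-17 comes out as day 0.

```c
/* Julian Day Number of the Julian Day that begins at noon on the given
 * proleptic Gregorian date (valid for years after -4800). */
long gregorian_to_jdn(int year, int month, int day)
{
    int  a = (14 - month) / 12;          /* 1 for January and February */
    long y = (long)year + 4800 - a;      /* years since a March-based epoch */
    int  m = month + 12 * a - 3;         /* March = 0 ... February = 11 */

    return day + (153 * m + 2) / 5 + 365 * y
         + y / 4 - y / 100 + y / 400 - 32045;
}

/* Modified Julian Day of the civil day starting at midnight on that date:
 * (JDN - 0.5) - 2,400,000.5 = JDN - 2,400,001. */
long gregorian_to_mjd(int year, int month, int day)
{
    return gregorian_to_jdn(year, month, day) - 2400001L;
}
```

Treating March as the first month pushes the leap day to the end of the counting year, which is what lets a single expression handle month lengths.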

IMPORTANT QUESTION, ALERT, and QUESTION
MJD values have been five digits in length since 1886-April-04 and a great many MJD date transmission standards expect and require five digits of MJD. Yet the authoritative definitions I have been able to find all define MJD as JulianDay - 2,400,000.5 and not ( JulianDay - 0.5 ) modulo 100,000. On Sunday 2132-August-31 the Julian Days will reach 2,500,000, and the MJD overflows five digits at the following midnight. This presents the following questions:
  • Who "Owns" (created?) the MJD definition? It seems to have been created in 1975. I would love to hear from anyone out there professionally involved in the event!
  • Are MJDs guaranteed to be 5-digits, and if so, what is the authority? (descriptions of data transfer formats, regardless of author, is NOT likely to be a true authority).
  • Just how many programs are going to break on this date?

This reminds me of C's struct tm tm_year field overflowing to 100 in year 2000.

Truncated Julian Days are also in use. This is the MJD truncated to four digits (remainder of MJD divided by 10,000) producing a more manageable four-digit date. The cycle returns to zero about every 27.4 years. The first TJD "day zero" was 1968-May-24 (a Friday). The second TJD cycle started on 1995-Oct-10. The third cycle will start on 2023-February-25. Computer programs using TJDs must take all due care to avoid problems when the cycle wraps. TJDs were created by NASA during the times of Apollo moon launches and have been picked up for use in other areas.

4.5 When Did The Third Millennium Begin?

The previous Millennium began in 1001. The current Millennium began in the year 2001 rather than the year 2000. By this time the only part remaining of any business that had not solved its Y2K problems was likely to be the lawsuits against it. Thus "Year 2000" (or "Y2K") is favored by this author for the popular "Y2K" bug rather than the term "millennium problem" (NOTE: Millennium Bug is trademarked). "Y2K" covers all problems that occur in the year 2000, not just the two-digit year problem the popular press would have you believe is the only bug out there.

Centuries follow the same conventions millennia do. We need a "20" in our century digits before we can close out the 20th century.

If you are still not convinced, try to think in Roman numerals like the early calendar designers did. The first year was year I. Each series of numbers starts with a single I: I through V, VI through X, XI through L, LI through C, and so on. Therefore the last year of the first century was year C. In the "natural" order of Roman Numerals a single I is needed in the first year of a new century. Therefore year M ended the first millennium and MM ends the second millennium, with year MMI starting the third.

The first year of the Christian Era was dated year I and was preceded by year I B.C. While some calendar calculation formulas consider year "0" to be what is today commonly called "1 BC", the official convention for publication is still "1 BC". Remember, the Romans did not use place-holder mathematics like we do and therefore did not have a number for zero. Roman numerals were still being used in the medieval times our current dating system was invented in.

However, decades seem to start when the one's digit returns to zero.

5. Miscellaneous Things

5.1a Date Numerology

Before Arabic numbers (0 through 9) came into use many cultures used letters for both words and numbers. This led to numbers that spelled important things, such as God, gods, rulers, and more, being given special significance. Some numbers were "adjusted" when written to avoid profane use of God's name.

While people still wring floods of "facts" out of numbers you won't read them here. However, a few fun tidbits follow.

5.1b Odd Day

1999-11-19 (1999-November-19) was a Gregorian "Odd Day" (every digit is odd). This was the last one until 3111-11-11, which is beyond anything we, or our programs, will ever see (for you digit-splitters out there... 1/1/3111 can also be considered a slightly earlier odd day... please delay any flames to me on the subject until 3110-12-01, when we are assured the Gregorian calendar will still be in use on this date.)

The first "Even Day" after the Y2K roll over was 2000-02-02, the first since 888-08-28 (sic.). The date of 01-01-01 was also numerically interesting.

5.2 Julian! Julian! Who Art Thou, Julian?

When it comes to the subject of dates the name Julian is rather overloaded. It is hoped the following table brings a little more order to the subject. The following Julians are ordered by date of birth.

The author wishes to thank Dr John Stockton and his contribution at http://www.merlyn.demon.co.uk/moredate.htm#Jul for this information.

6. Things That Go "00" In The Year [Y2K]

6.1 Often Overlooked Problems [Y2K]

The scope of Y2K problems is very large, not just those getting popular press. Major areas include:

Many of the "solutions" used by programmers to keep programs living on after the year 2000 did not truly fix the problems. They only put the software on life support systems that pushed the bug back a number of years. Companies would be well advised to determine if they have any of these time bombs lurking in their corporate code. Finding near term problems may be of critical interest. The sections on Y2K repairs should help developers spot such fixes and determine the life span of the "repair".

6.2 Aggressive Two-Digit Years [Y2K]

COBOL, RPG, and some other languages naturally allow programmers to explicitly declare that variables are to be kept as two-digit values. These uses are very aggressive in keeping two-digit values as two digits. Always and forever. Many other languages can do the same thing, though it may take more effort. Anything that uses "BCD" (Binary-Coded Decimal) to hold two-digit years must be considered guilty until proven innocent.

In these rules 99+1 yields not 100, but "00", the low-order two digits of the value. The carry of "1" that makes 100 is lost.
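
The lost carry can be imitated in C to see what it does to a simple calculation. This is only a sketch with hypothetical function names; C itself has no native two-digit type, so the wrap is simulated with a remainder.

```c
/* Add to a two-digit year the way a two-digit field does:
 * only the low-order two digits survive, so 99 + 1 gives 00. */
int add_two_digit_years(int yy, int delta)
{
    return ((yy + delta) % 100 + 100) % 100;
}

/* Naive elapsed-time calculation on two-digit years. */
int naive_elapsed_years(int start_yy, int end_yy)
{
    return end_yy - start_yy;
}
```

An account opened in year 99 and checked one year later, in year 00, appears to be -99 years old rather than 1 year old, and any comparison of the form "end >= start" quietly fails.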

This wrap back to zero adversely impacts all date calculations and tests. There are many variations on the theme that depend on the programming language, operating system, and applications the program makes use of. Some sample problems include:

6.3 Other Bad Years [Y2K]

Sadly the scope of Y2K problems was not limited to two-digit years. Even modern programs in modern languages, such as C, perl, and Java, can have problems with year 2000. This author has personally seen bad code in new programs written as late as 1998, and will not be surprised if he finds some problems in 1999 code. Section 8, Bad Coding Techniques, goes into the technical details.

6.4 Date Setbacks: The Ugly Y2K Work Around [Y2K]

A well-known work around exists for operations where it is not vital for the computer to have the correct date. This may work for industrial, and even many office, applications that are not networked or are sufficiently isolated from other major networks. If your operating system allows it (sorry Microsoft fans), set the date back 28 years. This returns to a previous point in the calendar cycle where the days of the week, leap year calculations, etc., are the same as the current year. The current 28-year cycle is valid for the years 1901 through 2099.
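
The 28-year claim can be checked mechanically. The sketch below uses hypothetical function names and a standard integer day-count formula. It compares the weekdays of January 1 and December 31 in a year and in the year 28 years earlier; if both match, the two years have the same length and the same calendar layout.

```c
/* Day count for a proleptic Gregorian date (a Julian Day Number). */
static long day_number(int year, int month, int day)
{
    int  a = (14 - month) / 12;
    long y = (long)year + 4800 - a;
    int  m = month + 12 * a - 3;

    return day + (153 * m + 2) / 5 + 365 * y
         + y / 4 - y / 100 + y / 400 - 32045;
}

/* Day of week: 0 = Sunday ... 6 = Saturday. */
int day_of_week(int year, int month, int day)
{
    return (int)((day_number(year, month, day) + 1) % 7);
}

/* Nonzero when a 28-year setback from `year` lands on an identical calendar. */
int setback_preserves_calendar(int year)
{
    int old = year - 28;

    return day_of_week(year, 1, 1)   == day_of_week(old, 1, 1)
        && day_of_week(year, 12, 31) == day_of_week(old, 12, 31);
}
```

The check succeeds for 2000 (set back to 1972) and fails for 2100 and for 1928, where the skipped leap days of 2100 and 1900 break the cycle.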

If you "setback" the dates in your database 28 years, and manually subtract 28 years from all dates entered into your software, etc., you may be able to continue using old software with a "cosmetic problem" of the wrong year being printed on reports. This is actually being done in places until they properly fix their problems. (Do you really care if, say the power companies computers, believe the year 2000 was 1972 as long as you got power in 2000?)
Note: some businesses may not be able to do this for legal reasons if the reports are legal records, etc.

Any date setbacks should have been started months before the year 2000 to debug them before the panic of 2000.

Using date setbacks has its own risks. Most of the problems, and defenses, involved in (9) Testing Applications By Changing CPU Dates also apply to date setbacks. This includes, but is not limited to,

NOTE: the Network Time Protocol software generally ignores years transmitted in the type packets, assuming the CPU's local clock has been properly tracking the current year. Any times obtained from other systems with NTP may therefore preserve the setback, advancing to the next year as appropriate. Test your system for this if you rely on NTP.

7. Native Computer Time Formats

7.1 Introduction

In order to better understand how programmers go wrong with dates and times in software, and improve the chances of writing correct time code in the future, we first need to understand the natural way that software uses time. Different operating systems and languages can have their own favored time formats.

7.2 History Of Computer Dates and Time

By the 1890's punched cards were starting to be used in "data processing". By the early 1900's punched cards were being widely used in the business world with "Electronic Accounting Machines" (EAM). These EAM machines were much dumber than today's computers and were controlled not by programs, but by "plug boards": plastic boards with many holes in them into which wires were plugged, or mechanical tabs placed, to select what was to be done with each column of incoming cards. The standard card evolved to have only 80 columns for information in it. As it was much easier to keep everything on a single card, long information, such as names and dates, was typically compressed as much as possible. File cabinets, specially made to hold "Hollerith cards", were used for the storage of information. Rows and rows of them. Sometimes buildings full of them. Warehouses of them. The automatic processing of two-digit years really started with EAM machines when "computers" were people who sat behind desks doing complex mathematical calculations all day long.

At the dawn of the age of business computers, computers simply took over the reading of Hollerith cards. To make switching to computers as easy as possible, most companies kept the same card format for the computers as they did with their EAM machines. Back then computer memory was very expensive, well exceeding $1 (U.S.) a byte. Add to this the huge amount of labor required for changing the existing programs using the files and the retraining of personnel and you get a huge amount of money. Spice this total with sheer corporate inertia and you get a recipe for sticking with two-digit years long after card systems vanished at a company. The designs of some languages, such as COBOL, made this very easy to do.

These early computers did not have any concept of time. At that time if a program needed to wait a specific amount of time, the programmer would send the computer into a short loop of instructions that did nothing but waste time. How long the computer stayed in the loop determined how long the wait was. Typically the computer could not do anything useful while in the time wasting loop. Moving the program to a faster computer could make the waits unacceptably shorter.

The next trick this author is aware of was adding "real time" timers to the computer to allow software to track how long various processes the computer was performing or watching took. The timers also allowed computers to track time of day and dates. As these early computers were slow, the timers tracking "wall clock" times were often driven by the frequency of the incoming line power. This made them independent of the inconsistent clock frequencies in the computer's hardware. Sixty-hertz became the standard in the United States. Fifty-hertz in much of Europe.

The next step was to add quartz timers to computers to split times into much finer fractions. Use of batteries allowed computers to keep time when powered off. (As the power companies spend millions of dollars to ensure the long term accuracy of line power is very good, even to this day, without special software adjustments, most modern quartz crystal clocks will drift from true time faster than the old-style line frequency timers.)

7.3 Native UNIX Time Formats

7.3.1 time_t values

The native time for UNIX is kept in variables of type "time_t". Historically these have been "signed long" values that record the time of day, in seconds, from an "epoch" of 1970-January-01. Traditional 31-bit values are good until about 2038-January-19 03:14:07 UTC (ignoring future leap seconds). Two solutions for extending this are to use 32-bit unsigned values (good until 2106-02-07 Sun 06:28:15 GMT) or to extend time_t to 63-bit signed values. Either of these formats goes well beyond the year 2038 limit of signed time_t. The 63-bit format goes well beyond the time the Earth's sun is expected to swell into a red giant making life impossible on Earth. In fact it allows for values 13 times beyond the age of the universe. This author believes it would be nice if a new time type was defined, say time_j, with an epoch of Julian Day Zero as described in section 4.4, or perhaps even go back to the beginning of the universe for time zero with a type of time_a (absolute time). In either case picking your zero can be tricky, but I will not bore those who do not ask with my thoughts on this subject.

While people may look at 2038 and say, "that is 36 years away, why worry?", this is what people said in the early 1900s when setting up EAM machines to use two-digit years. Even if the machines and software do not survive, data formats have proven to have much longer life spans.

The time_t values are "binary" time values. As such they are "Year 2000 safe", at least until programmers start doing something dangerous with them, like dividing or taking remainders of divisions. Addition and subtraction for small values are generally safe.

7.3.2 2038 time_t bug

A small perl program demonstrating the "2038 Problem" follows. When run on UNIX systems with a 32-bit signed time_t it writes two time_t values. The last valid UTC time for time_t is shown first. This is followed by one second beyond the last valid UTC value to demonstrate "negative" time values. (Note to C programmers: perl simply calls the corresponding C functions, returning the appropriate values in a perlish way, so perl's time_t value nicely follows your local C's time_t problem.) Your local system may express the bug differently.

#!/usr/bin/perl
### demonstration of UNIX "2038 Bug"
$EndOfTime = 0x7FFFFFFF; #last value valid as +signed 32-bit
$Format = '%04d-%02d-%02d %02d:%02d:%02dZ' . "\n";
@Gm = gmtime( $EndOfTime );     #break down last moment
printf $Format, $Gm[5]+1900,$Gm[4]+1,$Gm[3],@Gm[2,1,0];
@Gm = gmtime( $EndOfTime + 1 ); #beyond last valid moment
printf $Format, $Gm[5]+1900,$Gm[4]+1,$Gm[3],@Gm[2,1,0];

2038-01-19 03:14:07Z
1901-12-13 20:45:52Z

Leap seconds (section 3.6.4) may make some small adjustment to the actual time the bug expresses itself.
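For C programmers, the same wrap can be sketched without depending on the local width of time_t. The helper names below (wrap_to_int32, format_utc) are made up for this illustration, and the sketch assumes gmtime() accepts the value it is handed, which the C standard does not promise for negative values.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <time.h>

/* Reduce a seconds count to what a signed 32-bit time_t would hold.
   Written without relying on implementation-defined narrowing casts. */
int32_t wrap_to_int32(int64_t seconds)
{
    uint32_t low = (uint32_t)seconds;        /* keep the low 32 bits */
    if (low & 0x80000000u)                   /* sign bit set: negative value */
        return (int32_t)((int64_t)low - 4294967296LL);
    return (int32_t)low;
}

/* Format a seconds count as "YYYY-MM-DD hh:mm:ssZ". Returns 0 on success,
   -1 if the local library cannot break the value down. */
int format_utc(int64_t seconds, char *out, size_t size)
{
    assert(out != NULL && size > 0);
    time_t t = (time_t)seconds;
    struct tm *gm = gmtime(&t);              /* may be NULL for odd values */
    if (gm == NULL)
        return -1;
    return strftime(out, size, "%Y-%m-%d %H:%M:%SZ", gm) ? 0 : -1;
}
```

Passing wrap_to_int32(0x7FFFFFFF + 1LL) to format_utc() should reproduce the 1901 date above on libraries that accept negative values; others may simply report failure.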

7.3.3 Bad Casting

Some programmers cast (i.e., copy) time_t values to "long" or other data types not properly compatible with time values. This is a cavalier assumption that can not only limit the portability of the program to different environments, it can also cause the code to mysteriously break with future compiler enhancements.

7.3.4 time_t And Time zones

As this time is kept in UTC (formerly GMT), it carries no time zone information for users in other time zones. Requests for local times need to apply the appropriate offset to time_t values. When localtime() translates early time_t values, such as zero, into a local time with a negative offset from UTC, the date backs up a day into 1969-December-31. As -1 is the error return for time() and other functions, years of "69" with a minute and second of 59:59 pop up from time to time if the programmer does not check for error returns.
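A minimal C sketch of both points follows. The function names are hypothetical; the -1 year returned on failure is just a convention chosen for this example.

```c
#include <assert.h>
#include <time.h>

/* Fetch the current time, reporting failure instead of passing (time_t)-1
   along as if it were 1969-12-31 23:59:59 UTC. Returns 0 on success. */
int current_time_checked(time_t *out)
{
    assert(out != NULL);
    time_t now = time(NULL);
    if (now == (time_t)-1)
        return -1;                 /* clock unavailable: do not format this */
    *out = now;
    return 0;
}

/* Four-digit year a time_t falls in, seen from UTC. -1 on failure. */
int utc_year(time_t t)
{
    struct tm *p = gmtime(&t);
    return p ? p->tm_year + 1900 : -1;
}

/* Four-digit year a time_t falls in, seen from the local zone. -1 on failure. */
int local_year(time_t t)
{
    struct tm *p = localtime(&t);
    return p ? p->tm_year + 1900 : -1;
}
```

On a UNIX system whose local zone is west of Greenwich, local_year(0) is expected to give 1969 while utc_year(0) gives 1970.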

7.4 Native C Time Formats

C actually has two native time formats: time_t, as in UNIX, and the "tm" structure.
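The two formats convert into each other through the standard library. The following sketch (round_trip_local is a name invented here) breaks a time_t into a "tm" structure with localtime() and rebuilds it with mktime().

```c
#include <assert.h>
#include <time.h>

/* Break a time_t into local calendar fields, then rebuild the time_t. */
time_t round_trip_local(time_t t)
{
    struct tm *p = localtime(&t);
    assert(p != NULL);            /* fails only far outside real clock values */
    struct tm fields = *p;        /* copy: localtime() may reuse its buffer */
    return mktime(&fields);       /* tm_isdst from localtime() keeps the hour
                                     unambiguous around savings time changes */
}
```

For ordinary values the result should equal the input, which makes the function a handy smoke test of a local time zone setup.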

7.5 Beyond UNIX and C

These time formats have been adopted for use in operating systems and languages other than UNIX and C. Thus the strengths and weaknesses of these formats are available to many programs that have never been in C nor near UNIX systems.

Other operating systems track time in other ways. Nothing is universal.

7.6 IBM* PC, and compatibles, Time

Real Time Clock (RTC): the RTC is the primary clock on IBM PCs and their clones and reports times to the second. Though it is often embedded on the motherboard these days the RTC is a distinct hardware item of the computer. The RTC is initially set by the BIOS. Indeed, the BIOS and RTC can share a common, if small, area of special RAM that the BIOS battery preserves when the computer is powered off.

Interval Timers: as a granularity of a whole second is not sufficient for most operating systems the common practice is for an operating system (Windows, DOS, Linux, etc) to read the clock very carefully during boot. Before reading the RTC the OS will have previously started a higher resolution timer that can be used to track elapsed seconds. Such timers are often called "interval timers" and are used after the RTC is read to track the time of day with resolutions much higher than the RTC provides.

Different operating systems make different use of the RTC and interval timers. See your local operating system for details. Indeed, changing the time on some operating systems only updates the interval timer time and not the RTC. If the system keeps reverting to the old time on each boot this may be what is happening. Either find the command that sets the RTC or set the time from the BIOS on the next boot.

7.7 Microsoft Times

7.7.1 Microsoft Times

Microsoft products use times based on some epoch, with different products using different epochs. Some typical products and their epochs follow:
Epoch Year | Microsoft Product | Pivot Year | Special Notes
0001 | Microsoft .NET Framework Class Lib | n/a | 100nsec ticks since 0001-January-01 A.D.
1601 | Microsoft NT/2000 (Win32) | n/a | 100nsec ticks since 1601-January-01 for a range of about 29,200 years
1899 | Visual C++ DATE | n/a | 8-byte floating point value providing days since 1899-December-30
1900 | Excel dates for Windows | 19/20 (cell entry); n/a (DATE function) | VT_DATE: days since 1900-January-01 (which has a value of 2... buggy programs accepting 1900-February-29 start at 1). Q95948 Q215094
1904 | Excel dates for Macintosh | 19/20 (cell entry) | Days since 1904-January-02 (value 1, see all notes for Windows Excel). Switch between Mac. and Windows epochs by using Tools, Options, Calculation, and selecting or clearing the "1904 date system" checkbox.
1980 | MS-DOS{MarkR12b} FAT File Systems and DATE command | 00/99 |
1980 | Windows 3.11 | n/a |

Other products may use different epochs. Please E-mail me epochs for any major Microsoft product not listed here.

7.7.2 Microsoft 100ns Ticks

The "NT/2000" version, rooted in the year 1601, is actually available throughout the entire Win32 product line, including Consumer Windows (95/98/ME) and XP.

While the time format ostensibly has a resolution of 100 ns, no Win32 platform actually keeps time at that resolution. Windows 2000's clock is only good to 1/64th of a second (15.625 ms), the length of a clock tick. Other Win32 platforms have clock ticks of different lengths, and some can even change the length of a tick on the fly (!).

The Win32 API provides two functions for retrieving the current time, GetSystemTime() and GetSystemTimeAsFileTime(). The former retrieves the current time broken down into a SYSTEMTIME structure (analogous to struct tm), and the latter retrieves it as a 64-bit integer stuffed into two DWORDs in a FILETIME struct. (With the latter function, you can pass it an __int64 variable, with a typecast.)

However, when operating on a 64-bit timestamp, careless mixing of 32- and 64-bit quantities in a single expression in C/C++ can lead to the truncation of the 64-bit timestamp to 32 bits. There are a few easy things you can do to avoid the truncation, however:

While this may smack of a "belt-and-suspenders" approach to programming, this causes endless problems to people who are not careful.
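One such precaution, widening each 32-bit half before shifting, can be sketched in portable C as follows. This is an illustration chosen by the editor rather than a reconstruction of the author's list; the function names are hypothetical, and only the layout (two 32-bit halves holding 100 ns ticks since 1601) is taken from the text above.

```c
#include <assert.h>
#include <stdint.h>

/* Seconds between 1601-01-01 and 1970-01-01, both at 00:00:00 UTC. */
#define SECONDS_1601_TO_1970 INT64_C(11644473600)
#define TICKS_PER_SECOND     UINT64_C(10000000)      /* 100 ns ticks */

/* Join the two 32-bit halves of a FILETIME-style value. The cast comes
   BEFORE the shift so the high half is not shifted out of a 32-bit
   temporary. */
uint64_t join_ticks(uint32_t high, uint32_t low)
{
    return ((uint64_t)high << 32) | (uint64_t)low;
}

/* Convert 100 ns ticks since 1601 to whole seconds since 1970. */
int64_t ticks_to_unix_seconds(uint64_t ticks)
{
    assert(ticks <= (uint64_t)INT64_MAX);
    return (int64_t)(ticks / TICKS_PER_SECOND) - SECONDS_1601_TO_1970;
}
```

Writing the shift as high << 32 without the cast is the classic way to lose the upper half.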

If your code needs to be acceptable to more compilers than just Microsoft Visual C++, the following preprocessor incantations will probably be useful:


#if defined(WIN32)
#define _64(x) x ## i64
typedef __int64 int64;
typedef unsigned __int64 uint64;
#elif defined(__GNUC__)
#define _64(x) x ## LL
typedef long long int int64;
typedef unsigned long long int uint64;
#else
#error woops, need to fix _64() for this compiler!
#endif

7.8 Mainframe Time

IBM S/360 and ESA architectures track time since 1900 using a 64-bit fixed point value obtained from a TOD clock (Time Of Day). The high order 32-bits contain the time in (approximate) seconds while the low-order 32-bits contain a fractional time. The number of fractional bits that are actually used is model dependent.

The kicker is that the TOD is incremented at a decimal increment, with bit 51 representing one microsecond, rather than as a true binary fraction of a second. The result is a "long second" where the high order word is incremented once every 1.048576 seconds (2^20 microseconds).

Individual programs and languages that run on IBM mainframes have large libraries of time and date options. See specific programs and languages for individual details. The IBM publication The Year 2000 and Two-Digit Dates - A Guide to Planning and Implementation GC28-1251-05, may help mainframe users get started on time and date related issues. An unofficial copy was available at http://ano2000.kpnqwest.pt/ibm/best_ibm.pdf in PDF format.

Leap seconds (section 3.6.4) may make some small adjustment to the actual time the TOD clock overflows.

7.9 Macintosh* Time

Macintosh computers use an epoch time similar to the UNIX time previously described in section 7.3. However, the epoch starts on 1904-January-01 and uses a full 32-bit unsigned value that will overflow on 2040-February-06 at 06:28:16. This happens to be a few years after the UNIX signed 32-bit overflow date. People doing 30-year mortgage payment schedules should beware of this starting in 2010 (13 years from this writing).

Macintosh systems can be set to use the ISO 8601 date and time format. Please visit http://karchive.info.apple.com/article.html?orig=til&artnum=60753 for details.

Leap seconds (section 3.6.4) may make some small adjustment to the actual time the epoch overflows.

7.10 OS/2 Time

Anyone out there know? E-Mail to ghealton@exit109.com!

8. Bad Coding Techniques

8.1 Good Code

This section of bad code also contains some good code in showing corrections to bad code.

8.2 Ways To Abuse Good Formats

While the native UNIX and C time formats are effective and fully Year 2000 compliant, not every programmer using them is properly "date aware". No matter which languages are selected, it is always easier to write a bad program than a good program.

8.2.1 Errors In Data Types

  1. Assuming "time_t" values are int:
    int TheTime = time( (time_t *)NULL );/* not portable! */
    

    This works under most current UNIX systems where int and long are the same. But if the program is "fixed" by moving it to a 64-bit system, it will still fail in 2038 if int remains 31-bits while time_t (and long) grows to 63-bits.

    Programs using "int" for time will immediately fail when migrated to compilers defining ints as 16 bits.

    Using ints for variables that contain times makes it much more difficult to track down all statements impacted by date calculations during repairs or enhancements to their logic.

  2. Assuming "time_t" values are long:

    Not as many immediate portability problems as int, but this is still pasting a "kick me" sign on your back. Especially now that ANSI/ISO C 99 (C99) supports long long data types for 64-bits, leaving int and long for 32 bits. If time_t internally becomes long long programs assuming it is just long will be in trouble.

  3. Printing "time_t" values via printf() without casts:

    When printing time_t values on printf() type statements (or other variable argument list statements), the value should be explicitly cast to some well-known data type if a time_t variable is not expected.

    printf( "time_t TheTime=%llu\n", (unsigned long long)TheTime );

  4. Assuming time_t values contain seconds:

    The only way, in a portable manner, to obtain the interval, in seconds, between two times in the C language is to use the difftime() function.

    Unknown = Time1 - Time2;
    Seconds = difftime( Time1, Time2 );

    For other languages study the definition of the time functions very carefully. If needed, find the corresponding function in that language.

  5. Assuming time_t values start at some fixed date.

    While starting at 1970-January-01 is popular for UNIX, other operating systems are free to choose different base times. (And they do!)
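The points in this list can be gathered into a short C sketch. The function names are illustrative only, and the cast to long long assumes time_t is an integer type no wider than long long, which is common but not guaranteed.

```c
#include <assert.h>
#include <stdio.h>
#include <time.h>

/* Elapsed seconds from start to end, without assuming that time_t
   itself counts seconds. */
double elapsed_seconds(time_t start, time_t end)
{
    return difftime(end, start);
}

/* Format a time_t through an explicit cast to a well-known type.
   Returns 0 on success, -1 if the buffer is too small. */
int format_time_t(time_t t, char *out, size_t size)
{
    assert(out != NULL && size > 0);
    int n = snprintf(out, size, "%lld", (long long)t);
    return (n >= 0 && (size_t)n < size) ? 0 : -1;
}
```

Keeping every time in a time_t variable, and going through helpers like these at the edges, also makes date logic easier to find during later repairs.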

8.2.2 Errors In Date Calculations

  1. Using Modulo Years: [Y2K]

    Calculating struct tm years by using modulo 100 rather than subtracting 1900.

    
    x.tm_year = Year4 % 100;  /* popular, but sad, method */

  2. Dividing Or Moduloing time_t Values: Any division or modulo of time_t values needs to be studied with great care to determine if it has Y2K, or other date bugs, in it. Attempts to calculate some type of crude elapsed time or delay usually work well, but logic expecting exact day or year counts is suspect.
    
    Time = time( (time_t *)NULL );
    AboutSixMonthsAgo = Time - ( 6 * 30 * 24 * 60 * 60 );
    AboutSixDaysAgo = Time - ( 6 * 24 * 60 * 60 );
    /* above may be safe for crude approximations */
    
    Years = ( Time2 - Time1 ) / ( 365 * 24 * 60 * 60 );
    /* very likely a problem, especially if you need to
    catch differences in years. does not allow for
    time zones (OK in UTC time), and does not allow for
    leap years, nor leap seconds. */
    
    { /* one accurate way to calculate year offsets follows */
    struct tm  StructTm1, StructTm2;
    StructTm1 = *localtime( &Time1 );
    StructTm2 = *localtime( &Time2 );
    Years = StructTm2.tm_year - StructTm1.tm_year;
    }

  3. Errors In Pivot Years: [Y2K]

    "Pivot Years" allow two-digit years to be converted to four-digit years. The standard pivot year for UNIX is 68 (e.g., 68/69). 69-99 are assumed to be 1969 through 1999 while 00-68 are assumed to be 2000 through 2068.

    This author prefers to provide two values for pivot years (68/69) rather than just a single number (68) to avoid confusion on exactly where the boundary is.

    Mistake: Different pivot years in different programs.
    Repair: The same pivot year should be used in all programs.

    Mistake: Using pivot years of 69/70. These must not be used in most UNIX or C environments. Due to time-zone differences, the Americas, and other areas west of Greenwich, actually see years of 69 for small values of time_t where the time_t value is less than the offset from UTC time (e.g., during the first five hours of 1970-January-01 UTC for EST). Since these areas have a negative offset from UTC, their early values fall on the day before in local time: 1969-December-31. Programs using 69/70 for pivot years incorrectly translate 1969 into 2069 for these early times.
    Repair: Use pivot years of 68/69.

  4. Assuming Epoch is 1970:

    The common epoch of 1970 used on UNIX systems is not true for other operating systems (the entire world is not UNIX).
    Repair: Use #include "tzfile.h" to obtain a definition for EPOCH_YEAR (full four-digit year). If your local system does not supply tzfile.h, see section 10.4, Leap Years.

  5. Errors In tm_year: [Y2K]

    The standard C method for tracking year values is the tm_year member of struct tm. The value of tm_year is kept in "Year-1900". Many people assume these are two-digit values that will become zero in the year 2000, but this is not true. Starting in the year 2000 the value will use the three digit value 100, year 2001 will use 101, etc.

    The following is a popular perl bug. Creative programmers in other languages have migrated this bug to languages where it is difficult to express it.

    @LocalTime = localtime time;#obtain local time
    $Year = "19" . $LocalTime[5] if $LocalTime[5] >= 70;
    $Year = "20" . $LocalTime[5] if $LocalTime[5] <  70;

    This not only has the pivot year bug previously described (69/70), it will also yield the five-digit year "19100" for the year 2000. Even if tm_year did wrap back to zero, you would get the three-digit years 200 through 209 for the years 2000 through 2009. The correct fix is to use:

    $Year = $LocalTime[5] + 1900;

    Perl is such that you can use the numerically calculated value anyplace a character string is needed.

    Also see 8.2.4, Errors In Printing Years for common problems relating to printing tm_year values.

    [HOT TIP] Incorrect usage of tm_year values is a major source of errors!
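Subtracting tm_year values, as shown in item 2, gives the difference in calendar year numbers: 1999-December-31 to 2000-January-01 counts as one year. Where completed years are wanted instead (ages, anniversaries), a sketch along the following lines may help. The function name is made up for this example and time of day is ignored.

```c
#include <assert.h>
#include <time.h>

/* Completed years from *earlier to *later, judged by month and day. */
int whole_years_between(const struct tm *earlier, const struct tm *later)
{
    assert(earlier != NULL && later != NULL);
    int years = later->tm_year - earlier->tm_year;
    /* Has the anniversary not yet been reached in the later year? */
    if (later->tm_mon < earlier->tm_mon ||
        (later->tm_mon == earlier->tm_mon &&
         later->tm_mday < earlier->tm_mday))
        years--;
    return years;
}
```

Both structures would normally come from localtime() or gmtime(); because only tm_year differences are used, the year-1900 bias cancels out.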

8.2.3 Errors In Reading Dates

  1. Believing Two-Digit Years: [Y2K]

    When receiving a two-digit year, simply using the date without special checks is wrong.

    Programs reading two-digit years need to add "100" to the year if it is too small. A well-used pivot year is 68 (68/69):

    
    #define PIVOT_YEAR  68 /* window year (test <= this) */
    int year2;   /* current year as value of 00-199 */
    
    if ( year2 <= PIVOT_YEAR )
    year2 += 100;

    This makes a value that may be used with values in the "tm" structure for years of 1969 through 2068. Whatever pivot date you use, be sure all programs agree on it.

    More on pivot years can be found in section 10.8.

  2. Incorrect Adjustment Of Four-Digit Dates: [Y2K]

    When reading four-digit dates relating to struct tm values, be sure to SUBTRACT 1900 from the date rather than taking modulo 100 (remainder) of it before setting values in "tm" structures.

    
    Wrong.tm_year = Year4 % 100;/* VERY BAD */
    Good.tm_year = Year4 - 1900;/* correct tm_year */
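The two rules above, windowing two-digit years and subtracting 1900 from four-digit years, can be put together in a small C sketch. The function names are hypothetical and the 68/69 pivot is the one suggested in this document.

```c
#include <assert.h>

#define PIVOT_YEAR 68   /* 68/69 window: 00-68 are 20xx, 69-99 are 19xx */

/* Expand a two-digit year (00-99) to four digits using the pivot. */
int expand_two_digit_year(int yy)
{
    assert(yy >= 0 && yy <= 99);
    return (yy <= PIVOT_YEAR) ? 2000 + yy : 1900 + yy;
}

/* Convert a four-digit year into the year-1900 form kept in tm_year. */
int year4_to_tm_year(int year4)
{
    return year4 - 1900;
}
```

For example, a two-digit 00 becomes 2000 and is then stored in tm_year as 100.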

8.2.4 Errors In Printing Years [Y2K]

  1. Programs Not Expecting tm_year Years Greater Than 99

    This code will fail in the year 2000 when it prints "19100".

    void WrongDate ( struct tm *year ) {
    printf("The date is 19%d-%2d-%d",
    year->tm_year, year->tm_mon, year->tm_mday );
    }
    

    If it was just printing "The date is %d-..." it would print a year of "100-...", yet another way to go wrong.

    A correct way to print two-digit tm_year values follows:

    
    printf( "%02d/%02d/%02d\n",
    x.tm_mon+1, x.tm_mday, x.tm_year%100 );

  2. Improper Zero Suppression

    Programs expecting to "correct" the previous problem by only printing the last-two digits of "tm" years must not overlook printing leading zeros for the years 2000 to 2009.

    
    printf("The date is %2d/%2d/%2d",
    year->tm_mon, year->tm_mday, year->tm_year % 100 );
    
    

    While numerically valid, single-digit years may not be acceptable to customers, especially if just %d is used to format the year, which will format a single digit for the years 2000 through 2009.

    Using "%02d" will properly print two-digit years. ("%02d" is also suggested for month and day.)

    NOTE: "19%d" would print "190" to "199" for these ten problem years.

  3. Print tm_year Years As Non-Numeric Values [Y2K]

    Programs that format years themselves rather than relying on printf type formatting may produce non-numeric characters starting in the year 2000.

    
    char TheDate[9] = "mm/dd/yy";/* formatted year */
    
    TheDate[7] = year->tm_year % 10 + '0';  /* isolate 1s digit */
    TheDate[6] = year->tm_year / 10 + '0';  /* isolate 10s digit */
    

    Once year is greater than 99 TheDate[6] will contain non-numeric values (TheDate[6] = 100 / 10 + '0' is 10 + '0', or ':' for a year of ":0"). Windows 3.x, and some early 95/NT programs, pull this trick in places.

  4. Errors In Calculating Previous Dates [Y2K]

    1. Failure to allow for underflow on two-digit years:
      Year--;

      This logic would happily return -1 rather than 99 when backing up from year 2000 on programs that use two-digit years (00 for 2000).

    2. Failure to back up to 2000-February-29 from 2000-March-01. This has been a particular problem for many programs, "cfront" being one of them, which may impact the yesterday() functions of C++ programs.

8.3 Leap Year Bugs [Y2K]

8.3.1 Proper Technique [Y2K]

The proper way to determine if a year is or is not a leap year requires three different tests:

  1. If the year is evenly divisible by 4, it is typically a leap year unless,

  2. The year is evenly divisible by 100, which are not leap years, unless,

  3. The year is evenly divisible by 400, which remain leap years.

Additional information is in section 10.4 of Good Coding Techniques.
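The three tests can be written directly in C. This is a sketch with illustrative names; it takes a full four-digit year, with a second helper that adds the 1900 bias for struct tm values.

```c
#include <assert.h>
#include <time.h>

/* Gregorian leap-year test for a full four-digit year. */
int is_leap_year(int year4)
{
    assert(year4 > 0);
    if (year4 % 400 == 0)
        return 1;              /* test 3: divisible by 400 stays leap */
    if (year4 % 100 == 0)
        return 0;              /* test 2: other century years are not */
    return year4 % 4 == 0;     /* test 1 */
}

/* The same test for a struct tm, whose tm_year holds year-1900. */
int tm_is_leap_year(const struct tm *t)
{
    assert(t != NULL);
    return is_leap_year(t->tm_year + 1900);
}
```

Checking the tests in reverse order keeps each one a simple early return.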

8.3.2 Bad Tests [Y2K]

Programs that only use the first test will work for years from 1901 to 2099 (and technically have no Y2K error). Programs that just use the first two tests will fail in the year 2000.

The C language, and other languages using year-1900 formats, are open to the bug of dividing year-1900 values by 400 to test for leap years.


LeapSw = !(x.tm_year % 400)  ||
( !(x.tm_year % 4) && (x.tm_year % 100) );

This is a serious bug as you need to add 1900 to get proper results:


LeapSw = !((x.tm_year+1900) % 400)  ||
( !((x.tm_year+1900) % 4) && ((x.tm_year+1900) % 100) );

Typical failure symptoms include calculating the wrong day of week (Monday, Tuesday, etc.) on and after 2000-03-01, and calculations advancing over February-29 being short by one day.

8.3.3 Leap Year Legends

Some urban legends have a fourth test as well based on the fact that the average tropical year is about 365.242199 days (depending on which reference book you look at) while the Gregorian leap year calculations use 365.2425 (exactly) for an error of about 0.000301 days per year.

Other legends call for a double leap year in the future.

While a correction will be needed sometime in the distant future, the current rules are so accurate, and all of the earth's movements sufficiently chaotic, that the actual year this correction will be needed can not be predicted with sufficient accuracy. The popular guesses so far range between the year 3400 and 4300. By this time mankind may decide it's easier to adjust the earth's orbit than all the calendar calculation programs.

8.4 Length Of Day Bugs

Making errors when calculating times that cross minute, hour, and especially day boundaries is a very popular pastime in nearly every computer language. All it takes is a programmer overlooking several obvious or subtle features of time. The more common ways of making mistakes are discussed in this section in ways that apply to all programming languages. The lengths of minutes, hours, and days are also problematic when crossing midnight, for times spanning local savings time changes, and for times involving Leap Seconds. Any error in either calculating the next (or previous) day or crossing midnight can be loosely called a "length of day" bug. Many errors in calculating lengths of minutes and hours also adversely impact length of day calculations.

Programs internally using UTC time without dates tend to avoid most of these problems. Programs that use dates or local times tend to have more problems. Problems due to Leap Seconds tend to range over all classes of programs. Programs that calculate specific ending times to an hour, minute, and second in the future should be considered guilty until proven innocent. Bugs due to improperly processing local savings time tend to be particularly obnoxious as developers can easily overlook them or, not properly understanding savings time, insist their buggy code is correct ("it works for me!"). Sometimes these bugs may lurk inside a program for years before causing problems. Bugs showing times that are an hour off tend to be easy to find. Bugs silently miscalculating times that result in spoiled output may present no clue to what caused the failure.

Attempting to change to the next day, hour, or minute by adding a constant to the current time, especially local times, is a near universal cause for these bugs. Such calculations do not always change to the expected time. Other developers simply do not realize how provincial savings time rules are and apply their local rules to the rest of the world, or even their own nation. Operating systems that do not follow leap seconds diverge from the true time when a Leap Second strikes. Operating systems that do follow leap seconds can cause problems for applications that do not.

Using UTC time can greatly reduce these problems as UTC time is not subject to savings time changes. While leap second changes remain, these occur much less often and may only span one second. More information on leap seconds is found in section 3.6.4.

The most popular ways to go wrong include:

Adding 60 seconds to the current time should be considered "a minute from now" and not the "next minute". The same applies to hours and days.

Adding a constant needs to be considered an "approximate" advancement in time. Calculating an "interval" rather than an "ending time".

When there is a need to advance across local dates, with no real need for time, using a base of 12:00 (local noon) for calculations may help. Do not use midnight (00:00) or the current local time. This should help in avoiding errors due to savings time and leap second changes.

time_t    Time, Midnight, Noon, Tomorrow;
struct tm t;
time( &Time );/* fetch current epoch time */
t = *localtime( &Time );/* break down local time */
Midnight = /* calculate midnight in epoch time */
Time - (((t.tm_hour*60)+t.tm_min)*60+t.tm_sec);
/* (ignore savings time changes) */
Noon = Midnight + 12*60*60;/* noon, epoch time */
Tomorrow = Noon + 24*60*60;/* time FALLING in tomorrow */
/* (not ALWAYS noon) */

If your use of date is sufficiently casual that you don't care if it is off once in a while, then document the fact and just use the traditional

Tomorrow = UtcTime + 24*60*60;  /* calculate next day, ignoring leap seconds */
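Where the library's own calendar logic is trusted, another way to land on local noon of a later day is to adjust the "tm" fields and let mktime() normalize them. This is a sketch under that assumption, with a made-up function name, and not the only reasonable approach.

```c
#include <assert.h>
#include <time.h>

/* Store in *out the time_t for local noon on the calendar day `days`
   after the day containing `now`. Returns 0 on success, -1 on failure. */
int local_noon_after(time_t now, int days, time_t *out)
{
    assert(out != NULL);
    struct tm *p = localtime(&now);
    if (p == NULL)
        return -1;
    struct tm t = *p;
    t.tm_mday += days;       /* may run past month end; mktime() fixes it */
    t.tm_hour = 12;
    t.tm_min  = 0;
    t.tm_sec  = 0;
    t.tm_isdst = -1;         /* let the library decide if DST applies */
    time_t result = mktime(&t);
    if (result == (time_t)-1)
        return -1;
    *out = result;
    return 0;
}
```

Because the hour is set explicitly rather than reached by adding seconds, a savings time change between the two days does not move the result off noon.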

8.5 Makefile and Distribution Bugs [Y2K]

Complex applications often have programs not part of the main application suite to help build or distribute the applications. If these support programs have problems due to date bugs the makes may fail, deliver the wrong software, or have other problems. These bugs rarely occur in the makefile itself, but in the supporting programs.

Programs involving only simple makefiles, with simple targets, dependency rules, and make steps, are most unlikely to have this problem.

Exception: the underlying SCCS, RCS, or other source control programs, must be Y2K compliant for make to work. If you have any doubt at all, check any critical programs to be sure they are compliant.

If these support programs have Y2K bugs in them it may not be possible to build or distribute the application after 2000-January-01 until these bugs are repaired. The application itself may not have any Y2K bugs in it, working and testing fine after 2000-January-01. Thus people testing only applications built in the 199x time period may be missing something critical to successful Year 2000 operations.

On large or complex applications whose build tools use dates, part of the Y2K test should be advancing the local clock to various dates in year 2000, attempting to rebuild the application in the test years, and passing the program through any automatic cataloging or distribution processes. Without this test you may not be able to repair or distribute any Y2K bugs overlooked in the program until after you chase down the make bugs.

The following makefile support program has two such Y2K bugs in it. Study the program before reading the description of the problem to see if you can find these bugs.

BuildNumber.c - return build number for program

A build number is used to uniquely identify a particular build. This function encodes the build number as a magic number (50) followed by a program generated date and hour within that date.

#include <stdio.h>
#include <time.h>
char * BuildNumber(void) {
struct tm LocalTime;
char static Number[10+1];
time_t Time;
Time = time((time_t*)NULL);
LocalTime = *localtime( &Time );
sprintf( Number, "50%02d%02d%02d%02d",
LocalTime.tm_year, LocalTime.tm_mon+1,
LocalTime.tm_mday, LocalTime.tm_hour );
return Number;
}

FAILURE: This program will produce 50100010100 for 2000-01-01. Automated archive or distribution programs sorting build numbers by character values will suddenly think this is older than programs with build numbers of 5099123123 made moments before and happily redistribute the last program built in 1999 rather than the last program built in year 2000.

FAILURE: Number[11] is overflowed by one character as the year 2000 result is now 11 digits plus the terminating '\0' (null byte), 12 bytes total length. This can result in segmentation faults, truncated strings, or other undefined behavior, that completely prevents building new programs.

FIX: Change the sprintf() call to put out a three-digit year that still keeps a 10-digit number.


sprintf( Number, "50%02d%02d%02d%02d",   original bad line
sprintf( Number,  "5%03d%02d%02d%02d",   repaired line
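A bounded variant of the repaired format can be sketched as follows. The function names and the choice of snprintf() are this sketch's assumptions; the "5" plus three-digit tm_year layout is the repair described above.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>
#include <time.h>

#define BUILD_NUMBER_SIZE (10 + 1)   /* ten digits plus the final '\0' */

/* Format a build number as 5YYYMMDDHH, where YYY is tm_year (year-1900).
   Returns 0 on success, -1 if the result would not fit in the buffer. */
int format_build_number(const struct tm *when, char *out, size_t size)
{
    assert(when != NULL && out != NULL);
    int n = snprintf(out, size, "5%03d%02d%02d%02d",
                     when->tm_year, when->tm_mon + 1,
                     when->tm_mday, when->tm_hour);
    return (n >= 0 && (size_t)n < size) ? 0 : -1;
}

/* Nonzero when build number a sorts after b by character value, the
   comparison an archive or distribution tool might make. */
int build_number_is_newer(const char *a, const char *b)
{
    return strcmp(a, b) > 0;
}
```

With this layout a build made at 2000-01-01 00h (5100010100) sorts after one made at 1999-12-31 23h (5099123123), and an oversized result is reported instead of overrunning the buffer.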

8.6 Troff/Nroff Bugs [Y2K]

The numeric register \n(yr in the *roff family of programs may store year values in the form year-1900 or year%100, depending on implementation.

Many *roff programs simply use 19\n(yr to print the current year. This must be changed to match the good code described in section 10.2.4, Troff/Nroff Macros as it tends to format as 19100 or 190 on most versions of *roff in the year 2000. If \n(yr is used without a leading 19 it tends to format as 100 or 0 (single digit!).

Y2K bugs in the original *roff macros are legion.

8.7 Magic Dates [Y2K]

Back in the dim dark days of the 1950's and 1960's, when computers were large and slow, dates like 9/9/99 and 12/31/99 were so far in the future it boggled the mind. Thus, when programmers needed to signal special conditions in records containing dates, they reserved magic dates, such as 09/09/99, 12/31/99, or 99365 and 99099 (Ordinal days), to trigger the special conditions. These conditions included "end of file" conditions, undefined date, date not yet entered, deleted date record, keep forever, to name just a few. Using valid dates allowed the magic values to slip through any date validation code that prevented truly invalid dates like 00/00/00 or 99/99/99 from indicating the special conditions. Sadly magic test years have been used in software into the very late 1990's.

The difficulty is that these dates are real dates coming soon to computers near us. Programs using these dates are going to do some very strange things when they encounter these dates for real. In practice such programs must use values that are not valid dates, such as a reserved alphabetic name, 00/00/00, or 99/99/99 (non-zero values suggested to avoid accidental blanks from triggering the condition).

Once the program is repaired to use some other magic value, very thorough testing is needed using the former magic dates, testing transactions that start and end on these dates, to be sure all logic that tested for these dates has been consistently repaired.

8.8 1999-August-22 GPS Overflow

The Global Positioning System allows positions to be calculated by having satellites broadcast very accurate times of day to the GPS receivers. Once a receiver has three or more of these times it performs a bunch of fierce mathematical calculations on the subtle differences in the received times to triangulate the receiver's location. Four times are required to calculate elevation. Because the time is constantly broadcast, and is so accurate, more and more systems that log transaction times, or otherwise require good times, are using special "time only" GPS receivers that computers can read time from.

Computer systems relying on GPS for time of day must allow for various idiosyncrasies of the GPS time system. In short, the time used by the Global Positioning System is slowly drifting from UTC time. During year 2000 GPS system time was 13 seconds ahead of UTC time. This difference will change with each leap second adjustment applied to UTC. Any computers obtaining time of day from GPS receivers must provide the appropriate corrections to obtain a proper UTC time. The GPS receiver must also properly handle GPS End Of Week Rollover conditions where the internal cycle used by GPS resets to zero.

For those who want it, a longer and more technical description follows.

GPS receivers use a 1,024 week (7,168 day, slightly over 19.5 year) cycle where the first cycle started on 1980-January-06 00:00:00 UTC. GPS System Time started when TAI time was 19 seconds ahead of UTC time. International Atomic Time (TAI) started in 1958 and keeps pure atomic time without any "perceivable step adjustments" (e.g., leap second corrections). To simplify internal GPS operations GPS time also omits leap second corrections. Therefore GPS system time will slowly drift from UTC time but always remain 19 seconds behind TAI time. During the years 1999 and 2000 TAI was 32 seconds ahead of UTC time, which left GPS system time 13 seconds ahead of UTC. To calculate UTC time from GPS System Time the leap second correction in force at the moment must be applied.

On 1999-August-21 at 23:59:47 UTC (1999-August-22 00:00:00 GPS system time) the original GPS cycle reached its end and returned to week zero. Non-compliant receivers produced incorrect position reports as they reverted to the original start date of 1980-January-06. This condition is known as End Of Week (or EOW) or End Of Week Rollover in the GPS community. This problem tended to be limited to older receivers manufactured before 1995. Most modern receivers successfully coped with this roll-over, though some needed a software update to do so. Receivers impacted by this problem that did not accept firmware upgrades had to be discarded.

The next GPS rollover occurs on the weekend of Sunday, 2019-April-07. Visit http://www.navcen.uscg.gov/gps/geninfo/y2k/ for more information from the United States Coast Guard web site on GPS.

8.9 Microsoft April-01 DST Bug

Windows 95, 98, and NT 4 delay the return to daylight-saving time by a week (in the United States, at least) in years where April-01 falls on a Sunday. Instead of returning to DST on 2001-April-01, MSVCRT.DLL may return to DST on 2001-April-08. The problem first occurred in the year 2001. Once the DLL is replaced with a corrected version some applications may need to be recompiled. The problem appears to be fixed in Windows 2000.

Article Q214661, "FIX: Daylight Savings Time Bug in C Run-Time Library", in Microsoft's KnowledgeBase discusses this problem and provides links to where programmers can get Service Pack 3 or later for Visual Studio 6.0. A fix for regular users is available via an older standard Windows Update that everyone should have installed by now.

8.10 Shell Script Bugs [Y2K]

  1. Shell scripts often favor obtaining dates using `date +%y` rather than `date +%Y`.

  2. Another way to go wrong with a shell is to obtain different parts of a date in different calls to `date`.
    
    Month=`date +%m`;
    Day=`date +%d`;
    Year2=`date +%y`;
    

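    If this logic happened to execute across midnight, the fields would come from different days and you could get very strange results. The correct logic is to isolate all fields from a single date execution, using either the read statement or eval. Note that the read form depends on the shell running the last stage of a pipeline in the current shell, as ksh does; in shells that run it in a subshell (bash, by default) the variables are lost, so use the eval form there: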

    
    date '+%m %d %Y %H %M %S' | read Month Day Year4 HH MM SS
    -- or --
    eval `date "+Month=%m; Day=%d; Year4=%Y; HH=%H; MM=%M; SS=%S"`
    

  3. Calculating with two-digit %y years may produce one-digit values.
    
    LastYear=`expr \( $Year2 - 1 \)`;
    if [ $LastYear -lt 0 ]; then LastYear=99; fi
    LogFile="log/MyLog.$LastYear$Month$Day";#2001-2010 bug
    

  4. Calculating with two-digit %y years may produce three-digit values.
    
    NextYear=`expr \( $Year2 + 1 \)`;
    LogFile="log/MyLog.$NextYear$Month$Day";#2001+2009 bug
    

    Note that this is not a true year-1900 value, but simply a wrong value.

8.11 JavaScript Bugs

8.11.1 JavaScript General Bugs

JavaScript programmers dearly love to go wrong by

Also see http://www.merlyn.demon.co.uk/js-dates.htm#SDB.

8.11.2 JavaScript Bonus Bugs [Y2K]

The year returned by JavaScript getYear() method is extremely problematic. For some browsers years 1900 to 1999 return as values of 0 to 99 while all other years return a full four-digit value. Thus some programs will find the year "99" is followed by the year "2000". Other browsers always return four digit years. This is very much dependent on the browser being used at the moment.

JavaScript programs that do not convert years of 0 to 99 to 1900 to 1999 will fail at the stroke of a particular midnight. Code to cope with this bug follows:

var now = new Date();           // current date and time
var YearThis = now.getYear();   // get the year
if ( YearThis < 500 )           // allow for 1900-2399 to return 0-499.
    YearThis += 1900;

In fear that some (yet to be discovered) implementations of JavaScript will always return the years as year-1900, JavaScript programs written by this author break on the year 500 rather than the year 100. It would be easy for a JavaScript implemented in C or C++ to do this: simply returning the native year in "tm" structures will do it very nicely. This mistake is frighteningly easy to make.

An interesting result of this particular bug is that most year 2000 countdown programs in JavaScript are not year 2000 compliant. Many always expect the year to be just like it is in C: values of year-1900, with the year 2000 being 100 rather than the 2000 it will be. During the early days of the year 2000, if you see Y2K pages claiming there are "1900 years left until the year 2000", you will know what happened.

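Java itself is at least consistent: the getYear() method of java.util.Date always returns year-1900 (100 for the year 2000). That method is deprecated in favor of Calendar.get(Calendar.YEAR), which returns full four-digit years.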

8.12 System Bugs

[Y2K]Some systems do not allow for leap seconds, including values in time_t, while some systems do. Consult your local documentation. If your environment does not support leap seconds, allowing for future releases of the OS to support leap seconds is wise.

Some systems improperly calculate values of time_t. Thus values of time_t may not be portable across all systems or compilers. This is generally only a problem if the local operating system does not track time in time_t units making it necessary for the language library functions to do the translation. Differences to thousands of seconds have been observed by this author.

Naturally operating systems can have most of the other time bugs in this document.

9. Testing Applications By Changing CPU Dates

9.1 Introduction

When testing the date logic of programs, especially for testing Year 2000 related repairs, it is common practice to change the CPU's clock to test multiple dates. The best dates to test are those the application has a chance of breaking under. Traditional dates include 1999-12-31, 2000-01-01, 2000-02-29, 2000-03-01, and a date in the year 2010.

Instead of changing the CPU's clock, sometimes you can run special applications that allow you to leave the CPU's actual time alone, while making the application under test believe it is running in a different time rather than the current true time. The best of these applications allow testers to control the speed of the clock, faster or slower than real time. Another sign of a good application is one that adjusts the timestamp of files, etc., so the system sees the current true time but the application sees the advanced time. This reduces the need to artificially age file and other timestamps.

Changing the environment variable TZ, on systems supporting TZ, has proven itself insufficient for serious testing. Too much code is happy to ignore TZ. In C, TZ does not impact values in time_t variables, just local times.
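The last point can be seen in a short C sketch. It assumes a POSIX system providing setenv() and tzset(); the helper name LocalHourIn() is invented for this example. The same time_t value is broken down under two TZ settings: the local hour changes, the time_t does not.

```c
#define _POSIX_C_SOURCE 200112L  /* request setenv() and tzset() */
#include <stdlib.h>
#include <time.h>

/* Hypothetical helper: return the local hour (0-23) that the given
 * time_t maps to under the given TZ setting, or -1 on failure.
 * NOTE: this changes TZ for the whole process. */
int LocalHourIn( const char *Tz, time_t When )
{
    struct tm *Local;

    if ( setenv( "TZ", Tz, 1 ) != 0 )  /* replace any existing TZ */
        return -1;
    tzset();                           /* make the C library reread TZ */
    Local = localtime( &When );        /* When itself is not modified */
    return Local ? Local->tm_hour : -1;
}
```

With the POSIX style settings "UTC0" and "EST5" (five hours west of Greenwich, no daylight saving rule), a time_t of zero breaks down to hour 0 and hour 19 respectively, while the time_t remains zero. A program that never calls localtime() will therefore never notice a TZ change.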

9.2 Hazards Of Changing The Time

Changing the date applications run under is not without its own hazards. Before advancing the application's time for testing, be sure nothing is present to rise up and bite you. Some of these issues include, but are not limited to:

  1. On PC systems, check that the BIOS and real time clock actually accept the year 2000 (most do, though they don't always smoothly roll over from 1999 into 2000).
    NOTE: the fact that the BIOS does not smoothly roll over into 2000 should not affect any tests you are running as the operating system's clock is likely to roll over correctly.

     
  2. Don't advance a PC system's BIOS into the year 2000 or beyond unless you're willing to take a slight risk that the hardware will stop working until a technical type removes battery power and lets the system sit a few days, maybe replacing a few small things. This author has heard of systems this trick will not work on, though he personally has never met a system he had to try playing battery games with. Most PCs do not have any problem returning to the 1900's from the year 2000. No guarantees though.
     
  3. Do not try to set the timer of a PC at $40:$6C above $1800AF.
     
  4. On more general workstation systems, be sure the boot PROM can work with the year 2000 and beyond.
     
  5. Do not try to set the date for UNIX based systems into or beyond the year 2038. Attempts to do so may require very serious hardware work on your system to get it to boot again. The last person this author knows who set an older Sun system to the year 2038 had to open up the CPU, play hardware games, then completely reformat and rebuild all file systems. It took over two days to figure out all recovery techniques.
     
  6. Advancing the time may make any license managers believe the license has expired, making it impossible to run your tests. Consult vendors of software using license managers to see if you can not only do your tests, but will be able to return to production once the time returns to normal.
     
  7. If you test on your live production system, automated clean up procedures may decide to delete older information. If you advance the date enough, older information is your current live production data. Losing your current outstanding Accounts Receivable or Customer Order files is not a nice idea.
     
  8. Be prepared for strange timeouts as you advance the time. Moving the CPU's clock up a year in the middle second of a five second timeout can instantly make an interval of one-year and two seconds, resulting in an error recovery timeout. Example: if you are directly dialed into the computer system by a modem, be prepared for the modem software to hang up on you.
     
  9. Batch jobs running at scheduled intervals (e.g., UNIX cron jobs) may suddenly run all at once when you advance the current time. This has not been a problem for this author, except occasionally waiting for the crush of jobs to complete before starting new tests. Disable the scheduling of jobs that might cause problems if you can. If you can't, be sure database updates, etc., don't corrupt your files.
     
  10. Booting the system when the clock is far advanced in time is not advised unless you know it will not mess up your BIOS, boot PROM, whatever your local system calls it.

9.3 Before Changing The Time

Back up everything in sight to be sure you can recover from any data loss that may strike as a result of the tests.

The ideal test bed is a sacrificial system that is a clone of your production system, with full backups beforehand and a total reinstall later. Best of all, though hardest to set up, is a system isolated from your production networks. Sadly this is not practical for many people.

It may be necessary to artificially age dates in databases to match the time you are running the test in. Such data aging may best be done after the date is changed but before the tests are run.

If testing multiple future dates it is best to cycle through the test dates in ascending order.

9.4 After Changing The Time

After all of your test runs are complete, reboot the system after returning to the current time if you changed the CPU's actual time. Most operating systems take critical actions at regular intervals. Once the clock has been set into the future and then set back, the system may stop performing critical tasks until the clock again reaches the future time it last saw. Even if the system appears to work fine there may be problems brewing out of sight.

Computer operators often see such freezes when they correct the CPU's clock by 10 seconds or so after the clock has drifted from true time. If you just backed up the CPU time a year or more, it is a long wait without the reboot. Flushing the disk cache, updating graphic window displays, batch jobs running at scheduled intervals (e.g., UNIX cron jobs): the list of potential problem areas is very long.

After you run with the local time set to the future you may wind up with lots of files time-stamped with those future dates. While simply rebooting the system with the correct date and time allows the operating system to recover (usually!), applications depending on file time stamps or dates written within files and databases may become confused by the future dates. It is often easier to just scrub the system and reinstall once all tests are complete than to track down the future dates and revert them.

10. Good Coding Techniques

10.1 Also See

Some good "corrective code" was previously shown in section 8, Bad Coding Techniques.

10.2 Data Types

10.2.1 Use Native Types

Always use the native data type for the language to hold date and time values. In C, time_t holds numeric values suitable for comparing times and calculating with them. (NOTE: some systems are starting to use 64-bit time_t values rather than the currently popular 32-bit time_t values.)

Always use any built-in translation functions to map between different types of time data.

10.2.2 C Language Types

The following items are defined in the ANSI and ISO standards for C and should be available on all compilers. The use of other time functions may make it more difficult to compile the software on other systems.
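As a brief illustration of staying within the standard library, the sketch below uses only time_t, struct tm, gmtime(), mktime(), difftime(), and strftime() from <time.h>. The helper names (FormatIsoDate, NoonOf, DaysBetween) are invented for this example and are not part of any standard.

```c
#include <stddef.h>
#include <time.h>

/* Format a time_t as an ISO 8601 calendar date (UTC) into Buf.
 * Returns the number of characters written, or 0 on failure. */
size_t FormatIsoDate( time_t When, char *Buf, size_t Size )
{
    struct tm *Utc = gmtime( &When );
    if ( Utc == NULL )
        return 0;
    return strftime( Buf, Size, "%Y-%m-%d", Utc );
}

/* Build a time_t for local noon on the given date. Noon is used so
 * daylight saving changes near midnight cannot shift the date. */
time_t NoonOf( int Year, int Month, int Day )
{
    struct tm Tm = { 0 };
    Tm.tm_year  = Year - 1900;   /* struct tm holds year-1900 */
    Tm.tm_mon   = Month - 1;     /* struct tm months are 0-11 */
    Tm.tm_mday  = Day;
    Tm.tm_hour  = 12;
    Tm.tm_isdst = -1;            /* let the library decide on DST */
    return mktime( &Tm );
}

/* Whole days from the first time to the second, rounded to nearest. */
long DaysBetween( time_t From, time_t To )
{
    double Secs = difftime( To, From );  /* portable subtraction */
    return (long)( Secs / 86400.0 + ( Secs < 0 ? -0.5 : 0.5 ) );
}
```

For example, DaysBetween( NoonOf(2000,2,28), NoonOf(2000,3,1) ) counts the leap day without the program containing any calendar logic of its own.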

10.2.3 Perl Types

True to its heritage, perl programs typically keep time_t times in a simple numeric variable, as perl doesn't really have type definitions. Broken down time is kept in a simple array, where each member of the array corresponds to a member of the struct tm structure in C.

10.2.4 Troff/Nroff Macros

10.2.4.1 Introduction

The *roff family of programs makes the current date available to programs using the three numeric registers yr, mo, and dy for year, month, and day. Month and day use the traditional values starting at one. However, the value for year should be considered inconsistent. Year values may be in the form "year-1900" or "year%100", depending on implementation. Portable *roff documents should cope with both values. Even better is for authors to also allow for full four digit years in future versions of *roff.

The *roff family of programs traditionally maintain two independent symbol tables... one for "numeric registers" and another for "string registers" and macro definitions. The same symbol name may be used for a string or macro without conflicting with an existing numeric register. The date solutions proposed herein define new string register names that match, or are close to, the original numeric register names used for dates.

It is assumed that the serious reader knows the basic rules of *roff. *roff programs have very strange rules to define and reference symbols that are not discussed herein. Two rules to be particularly aware of here are that names are case sensitive and traditionally limited to two (2) characters in length.

10.2.4.2 Four-digit Years

Formatting four-digit years allows numeric registers to continue to hold a year value. The traditional yr register can not be used for this as it is reserved and changing it produces "undefined" actions on *roff programs. Even if it works for now, it may break in a different release of *roff.

As the native yr numeric register may contain year%100 or year-1900 style years, the following code sets the (new) numeric register Yr to a full four-digit year using the current year found in the built-in numeric register yr.


.nr Yr \n(yr\"                     set to "short" year
.if \n(Yr<69 .nr Yr \n(Yr+100\"    apply official UNIX pivot year
.if \n(Yr<500 .nr Yr \n(Yr+1900\"  ensure full four-digits

See section 10.8.2, Fixed Pivot Years, for a description of this logic.

10.2.4.3 Two-digit Years

Two-digit years should not continue to be stored in numeric registers as that will format years as a single digit number for the years 2000 through 2009. (Any testing of *roff pages should include formatting both years 200x and 20xx to check the switch between one and two digits.)

The following logic converts the numeric date registers into appropriate two-digit strings of the same name


.nr Yr \n(yr%100\"       ensure two-digit numeric register
.ds Yr \n(Yr\"           convert to string
.if \n(Yr<10 .ds Yr 0\n(Yr\"  0-9 becomes 00-09
.ds mo \n(mo
.if \n(mo<10 .ds mo 0\n(mo
.ds dy \n(dy
.if \n(dy<10 .ds dy 0\n(dy

Use these string values to obtain two-digit date fields.

To obtain one-digit month and days, with leading spaces, use:


.if \n(mo<10 .ds mo \0\n(mo
.if \n(dy<10 .ds dy \0\n(dy

Omit these "if" statements if you want one-digit month and days for values of 1 through 9. The current trend appears to always format years using two-digits.

10.2.4.4 ISO 8601 Dates

To obtain dates in ISO 8601 format, combine the previous Yr logic of four-digit years and the mo and dy logic of two-digit dates with the following statement:


.ds dt \*(Yr-\*(mo-\*(dy\"ISO 8601 date

Testing the date would use something like the following:


.br
Today's Date is \*(dt
.br

10.3 UNIX Data Types

The following functions are available on most UNIX systems. gettimeofday() returns current time, with potential resolution to the microsecond. "Potential", as this often exceeds available practical accuracy. See local system manuals for details.

ftime() returns current time, with a potential resolution of up to one-thousandth of a second (millisecond), along with current time zone and daylight saving time information.
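A minimal C sketch of using gettimeofday() follows, assuming a system that provides <sys/time.h>. The helper names are invented for this example. Because gettimeofday() reports wall clock time, intervals measured this way are affected by any change to the system clock.

```c
#include <stddef.h>
#include <sys/time.h>

/* Microseconds since 1970-01-01 00:00:00 UTC, or -1 on failure. */
long long MicrosSinceEpoch( void )
{
    struct timeval Now;
    if ( gettimeofday( &Now, NULL ) != 0 )
        return -1;
    return (long long)Now.tv_sec * 1000000LL + Now.tv_usec;
}

/* Microseconds elapsed between two gettimeofday() readings. */
long long ElapsedMicros( const struct timeval *Start,
                         const struct timeval *End )
{
    return ( (long long)End->tv_sec - Start->tv_sec ) * 1000000LL
           + ( End->tv_usec - Start->tv_usec );
}
```

The values carry microsecond resolution, but as noted above the practical accuracy depends on the local system.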

10.4 Leap Years

The proper way to determine if a year is or is not a leap year requires three different tests:

  1. If the year is evenly divisible by 4, it is typically a leap year unless,

  2. The year is evenly divisible by 100, which are not leap years, unless,

  3. The year is evenly divisible by 400, which remain leap years.

A correct C expression for determining if a year is a leap year or not follows:

LeapSw = !(Year % 4)  &&  ( (Year % 100)  ||  !(Year % 400) );

C Suggestion: Many UNIX systems provide a header file named tzfile.h that provides many definitions useful to time processing. If your system does not provide a tzfile.h you can download a compressed file from:

http://sunsite.doc.ic.ac.uk/public/pub/public/unix/4.3bsd-reno/include/

or create your own with just the following required definitions, changing any values as appropriate to your system:


/* tzfile.h -- local subset of regular tzfile.h */

/* determine if leap year:
(valid for current Gregorian Calendar) */
#define isleap(Year) ( \
!((Year) % 4) && (((Year) % 100) || !((Year) % 400)) \
)

#define EPOCH_YEAR 1970

/* end: tzfile.h */

10.5 Ordinal Dates

Ordinal Dates, popularly but incorrectly called Julian Dates (or Julian Calendar), provide a year and the number of days into that year. 2000-January-01 is 2000-001. 1999-December-31 is 1999-365.

To translate between standard Gregorian Dates and Ordinal Dates the following code fragment can provide a helpful model:


short OrdinalDays[] =  {
/* days (0-364 or 0-365) BEFORE each month */
0,31,59,90,120,151,181,212,243,273,304,334,365,
0,31,60,91,121,152,182,213,244,274,305,335,366
};
short LeapSw;/* 0 if not a leap year, 13 if leap year */

LeapSw = ( Year%4 || (!(Year%100) && Year%400)) ? 0 : 13;

To translate a Gregorian Month (1-12) and Day (1-31), into an Ordinal day number (0-364 or 0-365) for a given Year:

Ordinal = OrdinalDays[Month-1+LeapSw] + Day-1;

Note how LeapSw directs the array references to the first or last half of OrdinalDays where the first half is for normal years and the second half is for leap years.

To determine the number of days in the current Year:

DaysInYear = OrdinalDays[12+LeapSw];

Note that the OrdinalDays table has 13 month values in it for each type of year just to allow this subscript to work.

To determine the number of days in the current Gregorian Month:

DaysInMonth = OrdinalDays[Month+LeapSw]
- OrdinalDays[Month-1+LeapSw];
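The fragments above only translate in one direction. A self-contained sketch of both directions follows; it uses the same table layout, but the names DaysBefore, LeapOffset, ToOrdinal, and FromOrdinal are invented for this example. Ordinal values are zero-based (0-364 or 0-365), so add one when printing ISO 8601 ordinal dates such as 2000-001.

```c
/* Days (0-based) before each month; first 13 entries are for
 * normal years, last 13 entries are for leap years. */
static const short DaysBefore[26] = {
    0,31,59,90,120,151,181,212,243,273,304,334,365,
    0,31,60,91,121,152,182,213,244,274,305,335,366
};

/* 0 if not a leap year, 13 if leap year (index into DaysBefore) */
static int LeapOffset( int Year )
{
    return ( Year%4 || (!(Year%100) && Year%400) ) ? 0 : 13;
}

/* Gregorian Month (1-12) and Day (1-31) to ordinal day (0-365) */
int ToOrdinal( int Year, int Month, int Day )
{
    return DaysBefore[Month-1+LeapOffset(Year)] + Day-1;
}

/* Ordinal day (0-365) back to Gregorian Month and Day */
void FromOrdinal( int Year, int Ordinal, int *Month, int *Day )
{
    int Base = LeapOffset( Year );
    int M = 1;

    /* advance while the following month starts on or before Ordinal */
    while ( M < 12 && DaysBefore[M+Base] <= Ordinal )
        M++;
    *Month = M;
    *Day = Ordinal - DaysBefore[M-1+Base] + 1;
}
```

No range checking is done; callers are assumed to pass valid months, days, and ordinal values for the year.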

10.6 Julian Days

Julian Days (see section 4.4) track the number of days since 4713-January-01 BC. Traditional Julian days start a new day at noon.

Chronological Julian Days follow Julian Days but with a half-day offset to follow the current Gregorian convention of starting a new day at midnight.

Modified Julian Days start counting at midnight 1858-November-17 and include a 0.5 day offset from traditional Julian Days to align them with the current convention of starting a day at midnight.

Designing logic for date intensive programs may be much easier if the dates are stored and manipulated using some form of Chronological Julian Days. Problems associated with Y2K, 2038, leap years, and many others, all go away with Julian Days.

It is surprisingly easy to convert between the two if you keep to modern dates ignoring historic calendar corrections and other trivia. The following C code provides such a simple-minded example.


#include <time.h>

/* JULIAN_OFFSET: day number of the day before 0000-01-01
*  (year 0 is 1BC) if normal Julian Days are wanted, or the
*  offset needed to make 1858-11-17 day zero for Modified
*  Julian Days. Define USE_MJD (a name chosen for this
*  example) to select Modified Julian Days. */
#ifdef USE_MJD
#define JULIAN_OFFSET (1721059-2400001) /* use MJD */
#else
#define JULIAN_OFFSET  1721059 /* use full Julian Days */
#endif
typedef  long  jday;

jday Tm2Julian(
    struct tm  *Tm )  /* incoming broken-down time */
    /* (valid year range 1752 through far future) */
{
    jday  JulDay;
    int   Year;   /* full four-digit year */
    int   Year1;  /* Year, less 1 */
    Year = Tm->tm_year + 1900; /* calculate 4-digit yr */
    Year1 = Year - 1;          /* year, less 1 */
    JulDay = (jday)Year * 365; /* approximate day cnt */
    JulDay += (Year+3) / 4;    /* add in leap years */
    JulDay -= Year1 / 100;     /* fix for 100 years */
    JulDay += Year1 / 400;     /* fix for 400 years */
    JulDay += Tm->tm_yday + 1; /* add in ordinal day (1-366) */
    JulDay += JULIAN_OFFSET;   /* final adjustment */
    return JulDay;  /* return appropriate answer */
}

A perl subroutine library is available at http://www.exit109.com/~ghealton/y2k/julians.pl that converts between Julian Days, Ordinal Dates, and standard Gregorian calendar dates.

Picking a non-traditional epoch to start counting days is a popular way to obtain smaller day numbers for current dates.

10.7 Integer Days

A simplified version of Julian Days, that this author calls "integer days", allows up to 176 years to be packed into a 16-bit unsigned value. Encoding and decoding the actual year, month, and day is very direct and simple. This speed is at the expense of not being able to use these dates in calendar calculations. Subtracting two dates does not tell you the number of days between the two dates, though it does tell you the general relation between two dates.

The algorithm stores something close to the number of days that have occurred between the encoded date and a previously selected epoch (no dates may be before the epoch, even if IntDate is signed). The encoding is done under the assumption that each month has 31 days in it, resulting in a "year" of 372 days. To fit 365 days in a year this technique wastes 7 days a year (6 days during leap years).


#include <limits.h> /* get #define CHAR_BIT 8 */

typedef unsigned short IntDay;/* our data type */

#ifndef INT_DAY_EPOCH
#define INT_DAY_EPOCH  1970/* base year */
/* suggest close to 1970 if IntDay is unsigned short.
suggest 0 if IntDay is unsigned > short
suggest (-4713+1) if IntDay is signed long */
#endif
#define INT_DAY_DAYS  (31*12)/* maximum days in year */
#define INT_DAY_YMAX /* maximum years */ \
(((1U<<(sizeof(IntDay)*CHAR_BIT-1))/INT_DAY_DAYS)*2)
/* determine maximum value IntDay may hold,
* using power of two, via shift. Use CHAR_BIT-1
* to only use one-half of the total value
* to ensure we do NOT overflow and get a value
* of 0 by shifting the bit all the way out.
* Make up for this by the final *2, which
* doubles the previously halved value. */
/* Sample code to step through each year follows: */
/* for ( n = 0; n < INT_DAY_YMAX; ...) {*/
/*     int year = n + INT_DAY_EPOCH;  */

/* extract Gregorian year from an IntDay value */
#define IntDayYear(Day) \
( (Day) / ( 31 * 12 ) + INT_DAY_EPOCH )

/* IntDayEncode() - Gregorian Date To IntDay */
IntDay IntDayEncode(    /* returns compacted date */
int Year,       /* year to encode (e.g., 1998) */
int Month,      /* month to encode (1 thru 12) */
int Day )       /* day to encode (1 thru 31) */
{
    return (IntDay)(( (IntDay)( Year - INT_DAY_EPOCH
        ) * (IntDay)12 + Month-1 ) * 31 - 1 + Day );
}

/* IntDayDecode() - IntDay to Gregorian Date */
void IntDayDecode(/* has no value on return */
IntDay Encoded, /* compacted date to expand */
int *Year,      /* not NULL: ptr to store year at */
int *Month,     /* not NULL: ptr to store month at*/
int *Day )      /* not NULL: ptr to store day at */
{
    if ( Day ) *Day = Encoded % 31 + 1;
    Encoded /= 31;

    if ( Month ) *Month = Encoded % 12 + 1;
    Encoded /= 12;

    if ( Year ) *Year = Encoded + INT_DAY_EPOCH;

    return;
}

Other strange ways of packing dates also exist, such as the one known as GYMD as found at http://www.gtbaddow4.freeserve.co.uk/.

10.8 Pivot Years

10.8.1 Introduction

A pivot year, also known as date windowing, takes a two-digit year and expands it to determine which century the year is in. Typically the year is converted to either a full four-digit year or into the year-1900 format, as appropriate to the application at hand.

The following types of pivot years exist:

  1. Fixed Pivot Years: the pivot year is a previously selected static value.

    Advantages: easy to code and debug. Reading static dates in databases is always consistent.

    Disadvantages: limited life span of logic. Life span can be increased by reading in pivot year at run time from a fixed location to ensure it can change in time or if the program is moved to a different environment with a different pivot year.

  2. Sliding Pivot Year: the pivot year is calculated by subtracting a constant from the current year.

    Advantages: longer life span for logic.

    Disadvantages: not suitable for all applications. Harder to code and debug.

  3. Closest Date: The program calculates three dates in the previous century, current century, and the next century. The value closest to the current date is used.

    Advantages: longest life span for logic.

    Disadvantages: not suitable for all applications. Hardest to code and debug. Requires more testing.
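The second and third approaches in the list above can be sketched in C as follows. This is an illustration under stated assumptions rather than a definitive implementation: the function names and the YearsBack parameter are invented for this example, years are assumed positive, and the "closest" test compares whole years only rather than full dates.

```c
#include <stdlib.h>

/* Sliding pivot: expand a two-digit year (0-99) into the 100 year
 * window that starts YearsBack years before CurrentYear. */
int SlidingFixYear( int Year2, int CurrentYear, int YearsBack )
{
    int WindowStart = CurrentYear - YearsBack;  /* earliest year */
    int Year4 = WindowStart - WindowStart % 100 + Year2;

    if ( Year4 < WindowStart )  /* fell before the window? */
        Year4 += 100;           /* yes: use following century */
    return Year4;
}

/* Closest date: try the previous, current, and next century and
 * keep the candidate nearest CurrentYear. On an exact 50 year tie
 * the earlier candidate wins. */
int ClosestFixYear( int Year2, int CurrentYear )
{
    int Century = CurrentYear - CurrentYear % 100;
    int Best = Century - 100 + Year2;  /* previous century */
    int Offset;

    for ( Offset = 0; Offset <= 100; Offset += 100 )
    {   /* current century, then next century */
        int Candidate = Century + Offset + Year2;
        if ( abs( Candidate - CurrentYear ) < abs( Best - CurrentYear ) )
            Best = Candidate;
    }
    return Best;
}
```

With CurrentYear 2003 and YearsBack 80 the sliding window covers 1923 through 2022, so "22" becomes 2022 while "23" becomes 1923. The window moves forward each year without a code change, which is where the longer life span comes from.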

10.8.2 Fixed Pivot Years

Most programmers are using a constant pivot year to repair Y2K problems due to the speed and simplicity of the code change.

The following logic accepts as input either pure two-digit years, year-1900 values, or full four-digit years. The result is four-digit years.


#define YEAR_PIVOT    68/* official UNIX pivot */
#define YEAR_BREAK   500/* value < AnyYearWeUse &&
value > AnyYearWeUse-1900 */
int FixYear( int YearWork )
{
if ( YearWork <= YEAR_PIVOT ) /* 21'st century? */
YearWork += 100;          /* yes: pivot year */
if ( YearWork < YEAR_BREAK )  /* four-digit year? */
YearWork += 1900;         /* no: make four-digit */
return( YearWork );           /* return 4-digit year */
}

10.9 Accuracy versus Performance

Because time functions, no matter what the language, are generally system calls, they often have more overhead than calls to simple conventional functions. It is often best to obtain the time at the start of a transaction (e.g., reading a record, opening a connection, starting some request) and use that time throughout the transaction, especially if the transactions are short but frequent.

Remembering a single start time ensures all timestamps in different messages are synchronized with each other. Alternate time formats can be saved by calling functions like C's localtime() function, to convert the system time to other formats at the same time the system time is fetched.
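One way to arrange this in C is sketched below. The TxnStamp structure and TxnStampInit() function are hypothetical names invented for this example; the idea is simply to fetch the system time once and derive every other format from that single value.

```c
#include <time.h>

/* All time formats a transaction needs, derived from one reading */
typedef struct TxnStamp {
    time_t     When;      /* system time, fetched once */
    struct tm  Local;     /* broken-down local time */
    char       Text[20];  /* "YYYY-MM-DD hh:mm:ss" plus NUL */
} TxnStamp;

/* Fill in Stamp from Now. Returns 0 on success, -1 on failure. */
int TxnStampInit( TxnStamp *Stamp, time_t Now )
{
    struct tm *Local = localtime( &Now );

    if ( Local == NULL )
        return -1;
    Stamp->When  = Now;
    Stamp->Local = *Local;  /* copy: localtime() reuses its buffer */
    if ( strftime( Stamp->Text, sizeof Stamp->Text,
                   "%Y-%m-%d %H:%M:%S", &Stamp->Local ) == 0 )
        return -1;
    return 0;
}
```

A transaction would call TxnStampInit( &Stamp, time((time_t *)NULL) ) once at its start, then use Stamp.When for calculations and Stamp.Text for every log message it writes.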

Longer transactions, or time stamping log files or other places where exact time is critical can call for more frequent use of time functions. Fetching the system time within heavily executed inner loops should be avoided unless you truly need the time with such frequency.

Programs creating files may wish to use something like C's utime() function to set the timestamp of a file to exactly match a time appearing in an important log file to better assure users they have the correct file when looking at problems.

Section 8.10, Shell Script Bugs, also describes date related bugs found in shell scripts.

10.10 Coping With Untrusted Dates

If you receive dates that may have incorrect century digits in them due to Y2K bugs in the generating program, the following logic may prove useful:


Year2 = UntrustedYear % 100;/* discard century digits */
Year4 = FixYear( Year2 );/* rebuild century digits */

This code divides the incoming year by 100. The remainder, a two-digit year, is then adjusted to become an appropriate four-digit year.

10.11 Interval Timers versus Time Of Day

When coding logic that needs to wait a specific elapsed time, check to see if your local system supports a feature generally called "interval timers". These can be used, often with great precision, to wait a specific amount of time. Local sleep() functions often use interval timers.

Historically program delays have been performed by reading the current time, adding the desired delay to it, then waiting for that "wall clock time" to occur. This works fine providing the operator does not change the time of the system, to correct for the system becoming adrift of true time, at a critical moment. Testing date logic can also result in date changes.

See section 9.2, Hazards Of Changing The Time, for additional problems associated with using wall clock times. In many cases you will be stuck with wall clock times, regardless of the problems they cause, but don't use them out of reflex when interval timers are reasonable alternatives.
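Where the local system offers a clock that is unaffected by time of day changes, elapsed time can be measured against it directly. The sketch below assumes a POSIX system providing clock_gettime() with CLOCK_MONOTONIC and nanosleep(); neither is required by the C standard, so check local manuals. The function name WaitForFile() is invented for this example.

```c
#define _POSIX_C_SOURCE 200112L  /* request clock_gettime(), nanosleep() */
#include <sys/stat.h>
#include <time.h>

/* Wait until Path exists or TimeoutSecs of elapsed time pass.
 * Returns 0 if the file appeared, -1 on timeout or clock failure. */
int WaitForFile( const char *Path, double TimeoutSecs )
{
    struct timespec Start, Now;
    struct timespec Nap = { 0, 100000000L };  /* 100 ms between polls */
    struct stat StatBuff;
    double Elapsed;

    if ( clock_gettime( CLOCK_MONOTONIC, &Start ) != 0 )
        return -1;
    for (;;)
    {
        if ( stat( Path, &StatBuff ) == 0 )
            return 0;                   /* file exists: done! */
        if ( clock_gettime( CLOCK_MONOTONIC, &Now ) != 0 )
            return -1;
        Elapsed = (double)( Now.tv_sec - Start.tv_sec )
                + (double)( Now.tv_nsec - Start.tv_nsec ) / 1e9;
        if ( Elapsed >= TimeoutSecs )
            return -1;                  /* expired: stop */
        nanosleep( &Nap, NULL );        /* may return early on signals */
    }
}
```

Because the monotonic clock is not reset when an operator corrects the time of day, the loop needs no separate failsafe counter. On systems without such a clock, wall clock time with a failsafe remains the fallback.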

If you determine the extra effort is warranted to cope with occasional time of day corrections from operators, the following code presents a model you can expand on.


/* code to wait until the file whose path is in FileName
 * is created. Needs <sys/stat.h>, <time.h>, and <unistd.h>. */
#define DELAY   45      /* delay before timeout */
struct stat StatBuff;   /* results of stat() */
int    DelayLeft;       /* failsafe time countdown */
time_t TimeLast;        /* time loop expires */
time_t DelayLast;       /* last observed time */
DelayLeft = DELAY;      /* set failsafe counter */
DelayLast = (time_t)0;  /* and set associated time */
TimeLast = time((time_t*)NULL) + DELAY; /* expire time */
while( time((time_t *)NULL) <= TimeLast )
{   /* keep waiting for the file to appear */
    if ( stat( FileName, &StatBuff ) == 0 )
    {   /* file exists */
        break;          /* done! */
    }
    /* (the main concept this section is demonstrating is
     * in the following if...) */
    if ( DelayLast != time((time_t *)NULL) )
    {   /* in a new second: failsafe test to ensure
         * we never loop too long if system clock
         * moves backwards on us  */
        if ( DelayLeft-- < 0 )  /* count down failsafe */
            break;              /* expired: stop */
        time( &DelayLast );     /* remember new time */
    }

    sleep(1);   /* sleep a one second interval */
    /* NOTE: signals, and other events, may result in
     * the sleep returning early. It may also sleep
     * noticeably longer than 1 second on busy systems.
     * In some environments simply counting sleep(1)
     * calls will not work, at least without a
     * lot of additional code. */
}

The sleep, while it may use interval timers, does not make the outer loop immune to time changes. Without the special test that watches for, and counts down, new seconds the system would suffer delays if the system time went backwards in the middle of the loop. Note that each backwards time change may result in one-second less of a wait before timeout.

Advancing the time in a forward direction may make the outer loop "timeout" early in the while statement. If the time change is small, such as when the local CPU time is corrected, the application should not have trouble if the timeout period is sufficiently large.

11. References

The references that were here were seriously out of date. Until this can be fixed please visit my latest date and time links, which have many more links beyond the scope of this document, at
http://www.exit109.com/~ghealton/.dates.html (dot dates dot html)

There are a number of Y2K links scattered throughout this document.

12. Credits

Thanks to the few, but wonderful, people that helped improve this document by sending me corrections, revisions, and interesting things. I really do appreciate it when someone shreds my document in a fit of serious proofreading. A few rounds of such corrections help make my document much stronger. Technical documents of this size are very difficult to get correct without the help of third party readers happy to shoot at the document flaws.
Murisier Serge
(1999-09)
s.murisier[]cross-systems.com Comments about Microsoft Epoch dates.
José Carlos Fernández Gutiérrez
(2000-02)
emejcfg[]madrid.es.eu.ericsson.se Reported problem in automatic formatting of date in title.
Valerie Kramer
(2000-11)
funzone[]harborside.com Correction to "Odd Day".
Jerome Fine
(2000-11)
jhfine[]idirect.com Lots of information about the tropical year and Leap Second drift. Including the difficulty in projecting future drift. Hopefully Version 2 of the document will have more of these comments merged into it.
Gordon Speer
(2000-02)
speer[]essex1.com Also caught my "Odd Day" mistake. First person to report Latitude / Longitude error.
Ian Galpin
(2002-01)
g1smd[]amsat.org Editor for the ISO 8601 Standard section of the Open Directory Project (ODP).
Many changes throughout the document, especially in areas concerning ISO 8601. Highlights of the corrections include:
  • Leap seconds added at same time around the globe.
  • Latitude / Longitude error spotted.
  • ISO stands for International Organization for Standardization.
  • Clarifying "T" versus space in ISO 8601 dates and times.
  • 4.2.3: corrections to reduction / truncation table.
  • Week Number corrections.
  • Roman Numeral corrections.
  • GPS updates.
  • Whitaker's Almanack reference.
Ed Davies
(2002-01)
edavies[]nildram.co.uk Reported problem in ISO timezone offsets along with several general typos.
Dr. John Stockton
(2002-01)
(2002-07)
jrs[]merlyn.demon.co.uk
  • Suggested inclusion of the mess they made of the early Julian calendar.
  • Reworded "shortest year" descriptions at his suggestion.
  • EU uses uniform savings time plan.
  • Astronomical dates added by request.
  • Provided fodder for 20/02/2002 in Odd Day.
  • Technical notes, and some corrections, about computer time keeping.
  • Additional information about GPS.
  • Time test hazards: described DOS timer at $40:$6C above $1800AF problem.
  • Julian! Julian!, Who Are Thou Julian? added by request.
  • "Some Transport organisation(s) use(s) a day from 03:00 to 27:00" demoted to rumor unless additional information is found.
David R Tribble
(2002-02)
david[]tribble.com
Thomas Scheidegger
(2002-03)
tscheide[]swissonline.ch
J. S. Connell
(2002-09)
ankh[]canuck.gen.nz
  • Additional information on Microsoft 100ms ticks.
  • Information on 1904-01-02 Macintosh Excel epoch.
  • Assorted corrections.
Troy Goodson
(2002-12)
Troy_Goodson[]iname.com Use proper "Coordinated Universal Time", not "Universal Coordinated Time", as some major sites have.
Dan Kohn
(2003-02)
dan[]dankohn.com Brought RFC-3339 to my attention.
Daniel Biddle
(2003-07)
deltab[]osian.net
  • Missing comma in RFC-822 date and time format.
  • Some technical details about how sendmail writes dates not following RFC-822.
  • Spotted use of "jjj" for some ordinal dates... now using "ddd".
  • tzinfo supports both "posix" zones, which stay POSIX compliant (and ignore leap seconds), and "right" zones, which observe leap seconds.

NOTE: the strange use of the [] image in the previous E-mail addresses is to hide the addresses of these kind people from being harvested by evil E-mail robots that bomb any E-mail address they find with spam.


$Id: yrexamples.html,v 1.62 2003/07/09 11:39:27 ghealton Exp ghealton $
