Summary: General summary of calendars, dates, and times from past to the present. This includes details on using the ISO 8601 standard in and out of computers. It ends with information for software developers wanting to properly use dates and times.
This document should interest anyone curious about dates, times, or calendars. Readers may be regular people, managers, or people who develop computer software. The document has been written so that anyone may start at the beginning and stop once they have had enough of the subject.
The document starts with a short history of calendars and timekeeping. Next comes a discussion of various standards, especially the "new" ISO 8601 that is gaining in popularity on an international basis. The final sections are for authors of computer programs. The technical sections start with general use of date and times and then advance to specific examples in different programming languages. While these final sections cover many programming languages, the main effort is for C and UNIX along with derived languages and operating systems, such as Linux, C++, and the Javas.
This document is under revision and subject to change (2002-12).
Indeed, work is (slowly) being done on a major revision of this document.
Some of the links that have expired since this document was first written have yet to be found and corrected by this author. Please visit http://www.exit109.com/~ghealton/.dates.html for the list of my current time and date links (dot dates dot html).
While this document was originally written before the Year 2000 to help software developers cope with it, and much of the tense herein is past tense, the majority of the information still applies to new programs being written today.
The information in this document is on a best effort basis. The author actively solicits additions, corrections, and clarifications. Because this documentation and code herein is free of charge, there is no warranty for the contents. All use of the information herein is at the user's own risk.
Webmasters with links to this site should notify the author to be
advised of the update.
This is important as the new file will have a new file name.
Perhaps even a new domain name.
This list of registered references is included in
http://www.exit109.com/~ghealton/.home.html
(dot home dot html)
under References To My Web Pages.
Prefer links similar to:
Portions of this document copyright 1995-2002 by
Gilbert Healton
(ghealton@exit109.com)
All rights reserved. Permission is granted to reproduce the document, in whole or in part, for local use providing this copyright notice, this table box, and the title, including author's name, are reproduced in full and downloaders keep their archives current by checking back at least once a quarter or registering with the author to be advised of any significant updates.
Many of the designations used by manufacturers and sellers to distinguish their products are claimed as trademarks. Trademarks this author is aware of use the capitalization associated with the trademark. Known trademarks follow. Click on the trademark to get the official Trademark Usage Guide from the trademark owner (if available on the web). Click on the owner name to get the owner's home page (again, if available).
- IBM is a registered trademark of International Business Machines, Inc.
- Microsoft, MS-DOS, Windows, Windows 95 and Windows NT are registered trademarks of Microsoft Corporation.
- UNIX is a registered trademark of The Open Group.
Please let the author know of any trademarks used in this document that are not listed here.
The section for general readers provides a history of calendars, times, and related background information many find interesting. The document starts with a section for general readers and gets more complex as it goes along. Therefore general readers may simply read until they've had enough on the subject. A few pieces of technical gibberish have been sprinkled throughout this section to provide important information to technical readers. Normal people may simply ignore these short bursts of strangeness.
The sections for programmers describe ways programs misuse or properly use dates and times, including how to spot and repair bad date code and how to write good date code. Year 2000 problems are just one point covered herein.
[Y2K] leads information describing techniques frequently associated with Year 2000 problems. Clicking on the [Y2K] should advance you through a circular list of all [Y2K] notes in this document.
Y2K presents a unique challenge to the computer industry and its users. Never before has such an immovable deadline hit so much of the industry. Year 2000 can not be postponed if the software is late. Unlike typical bugs, which strike at random intervals or at those customers trying something new, Y2K can strike across the entire customer base, striking at the same time, at customers that haven't changed anything. Not just in one part of the program, but at numerous locations of many programs. Critical supplies from key vendors are also at risk at the same time, both for Y2K and overloading, which may reduce the resources you have available to you for repairing your problems.
The Roman Calendar, from which the current Gregorian Calendar is derived, was viewed differently by the Romans than calendars are today. Rather than counting days up in the month, they set two key reference points in each month and counted the days remaining until the reference point. The first of the month was the Kalendae (or Calends in today's desk dictionary, sometimes spelled as Kalends in other sources) while the middle of the month was the Idus (or Ides). The Ides was the 13th day of most months and the 15th day for months with 31 days.
One day before the Ides of March was March-14.
One day before the Calends of April was March-31.
(With the Year 2000 problem, we need to "beware the Calends of January".)
This author has also seen references to the Nonae (Nones), which seems to be the 9th day before the Ides (13th or 15th of the month). Adjust the previous confusion accordingly if you encounter this term.
The old Roman Calendar also considered the Calends of January to start a new year (personal observation: shortly after the winter days start to get longer). However, when it came along, the Christian Church changed this in 567 by order of the Council of Tours, for religious/political purposes. However, exactly what day years should start on was not well specified and varied from country to country and time to time. March was a popular month to start new years (spring time, when the earth came alive again), though there was no agreement on what day in March should start a new year. The Calends and Ides of March were among the more popular selections for March new years.
By 46 BC the Roman calendar was a major mess. Among other things the lack of leap years had gradually lost enough days to make the calendar claim spring, the start of a new year, the Calends of March, actually occurred in what was really late November, a few days before the Calends of December. Not a good time to plant your spring crops.
Julius Caesar, under the advice of the astronomer Sosigenes of Alexandria, issued a decree that lengthened 46 BC to 445 days to bring the calendar year into line with the solar year. This made the longest year on record.
As part of the 46 BC correction, subsequent years had an extra day added six days before the Calends of March every fourth year. Rather than making a separate day number, this month had two "six days before the Calends of March", or "two sixes", which the current word bissextile is derived from. Thus February-24 is historically the leap day, not February-29. In the American and the European community this is often called the "Old Style Date" or "Old Style Calendar".
Ignoring differing leap years, the months used in this calendar match the months in the current Gregorian Calendar.
Unfortunately, the priests of the time greatly botched leap year calculations after this calendar correction and it took a few decades to discover this fact. Claus Tøndering's Calendar FAQ details the aftermath of these errors lasting through 4 AD, which omitted 0004-02-29.
Every 128 years the Julian Calendar became off by an additional day. By 1263 Pope Urban IV received a letter from Roger Bacon urging a calendar adjustment to correct the current error and prevent future errors. In 1582 Pope Gregory XIII, working with the Council of Trent, acted on this solution when it was proposed again. The 10 extra days in the year were taken from October to restore the equinox to March-21.
The rules of 100 and 400 in calculating leap years were also added at this time.
Leap years are years evenly divisible by 4, unless the year is evenly divisible by 100. However years evenly divisible by 400 remain leap years.
In the American and the European community this is often called the "New Style Date" or "New Style Calendar".
While the Gregorian Calendar was surprisingly accurate, it was slow to be adopted by some countries, especially in Protestant countries. The Protestant Reformation started in 1517-October-31 and, for political reasons, often greatly delayed the adoption of the Gregorian Calendar in Protestant countries.
Effective in 1752 the British Parliament adopted the British Calendar Act of 1751, taking the eleven extra days from September (the British calendar became an additional day off in the year 1700, which British Calendars considered a leap year and Gregorian Calendars did not). The parts of America under the control of the British followed this change. The parts under control of France and Spain changed in 1582. Other places changed at other times. What a mess.
Along with dropping 11 days from September, the legal start of year, March-25, was moved to match the common practice of January-01. This made the stub year 1751 the shortest year on record: it consisted only of March-25 through December-31. It was only 282 days long.
The second shortest year depends on the country you are in. Except for countries where the start of the year changed, such as England, the shortest year was during the Julian to Gregorian calendar change. The year length depended on how many days were dropped from that year. Corrections made during leap years usually resulted in an additional day being added to the year (normally 1752 would have been a leap year, but as this stub year did not have a February, it could not be a leap year). Changes made in 1582 resulted in years of 355 days. Changes in the 1700's made years 354 days long, or 355 days in leap years. Changes made in the 1800's produced 353/354 day years. Changes made in the 1900's or 2000's require years 352/353 days long.
The common practice of using the Pope's calendar in England before 1752 resulted in some confusion as to what year it was for the months of January, February, and part of March. To avoid confusion many colonial records used dates providing both the official and common year. A typical example is "5 Feb 1750/51".
After 11 days were dropped from 1752-September many people adjusted annual events based on previous dates, such as birthdays, to continue observing them a true year apart. This resulted in people writing both days and both years for some dates when referring to the "old style" dates. George Washington's birthday could be written as "11/22 Feb 1731/32".
The Year 2000 problem is not the first point of major confusion over calendars.
Countries like Greece didn't adopt the Gregorian reform until years as late as 1923, when they had to drop 13 days. Whitaker's Almanac contains a complete section on the differing calendars in use around the world and throughout the ages. Claus Tøndering's Calendar FAQ makes the dates of the changeover for many countries available on the web.
The Gregorian Calendar is by no means universal. Some places, or religions, still use the old Julian Calendar while others have their own historic calendars that continue to be used to this day. There are many such calendars in use today... well into three digit values.
The concept of months appears to have originally been based on the moon's "monthly cycle" (synodic period) of about 29.5309 days. Early calendars based on the synodic month quickly came out of sync with the solar year, as a lunar year of twelve such months is only about 354.367 days.
Today's Gregorian calendar is based only on the period of earth's year around the sun. However, the Islamic calendar still uses the moon's period and not the sun's.
Lunisolar calendars use both the lunar and solar cycles. Every few years a whole month is inserted into the year to bring the calendar back into line with the solar calendar. The Chinese and Hebrew calendars are modern examples of lunisolar calendars.
When astronomers run dates backwards before year 1 BC they simply use straight number signs. Rather than use 1 BC they use year 0. The year 2 BC becomes year -1, etc. This helps keep the computers doing their calculations happy.
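The year mapping just described can be sketched in C. This is a minimal illustration only: the function name astronomical_year and its parameters are assumptions made for this example and do not come from any standard library.

```c
#include <stdbool.h>

/* Converts a historical era year (1 or greater, B.C. or A.D.) into an
 * astronomical year number: 1 BC -> 0, 2 BC -> -1, A.D. years unchanged.
 * Hypothetical helper, for illustration only. */
int astronomical_year(int era_year, bool is_bc)
{
    return is_bc ? 1 - era_year : era_year;
}
```

With this numbering, the difference between two years is a plain subtraction, with no special case at the B.C./A.D. boundary.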
Three other times to start a new day have been seen by this author: morning, when the sun rises; noon, when the sun is high in the sky; and evening, when dusk starts. These are still in use today in different parts of the world, especially for religious or other special uses.
Noon is the historic start of day for the old Roman Empire. This author suspects that the decision to start a new day at noon had something to do with watching sundial shadows peak and fall at noon along with keeping all astrological observations at night, where the sky was much more interesting, in the same "day". What really is different between current and Roman days is that Romans tracked time with sundials using 12 hours of daylight and 12 hours of night. Therefore the length of a Roman "hour" was longer in summer and shorter in winter. Today's astronomers continue to use the noon-to-noon days when tracking astronomical days.
Rainfall and river flow in the UK are measured from 9am to 9am and ascribed to the day which contains the greater part of the time.
It's been rumored that some transport companies record days using 03:00 to 27:00 to allow customers going to their work day to be scheduled on the same transport day. This author would love to hear from anyone with definitive knowledge on the subject, historical or current.
How we track time also has changed. Long ago noon was when the sun was high in the sky, a purely local event. The local clock keeper of the town would ensure the clock matched the sun each day and everyone took their time from the official local clock. With the advent of faster travel, in particular the railroads, a need for a standard, predictable, time arose. Many long distances only had a single-track railroad and without electronic communications running trains strictly on schedule was vital to avoiding collisions.
In 1884 an international convention divided the world into twenty-four time zones. There is no international standard for the actual names or abbreviations of these zones. Each country is free to define its own names and abbreviations and RFC-822 defines its own set of names. Countries may declare their own compliance to time zones. Indeed some countries are 15, 30, and 45 minutes off of UTC. Historically even stranger offsets from UTC have been used. From 1909-May-01 through 1937-June-30 the Netherlands was exactly 19 minutes and 32.13 seconds ahead of UTC by law (such offsets can not be represented exactly in the ISO 8601 standard). Programs that need to process time zones must allow minute, even second, offsets. The time zones are not straight lines, but snake around to meet the needs of the local people.
Time zones center on the "prime meridian", which was originally defined as longitude zero, a line that bisected a critical part of the main telescope at the Royal Observatory at Greenwich in south-east London. All countries defined their time as some offset, positive or negative, away from "Greenwich Mean Time" (GMT). GMT is based on mean local time at an agreed-on zero longitude. This Mean Time at Greenwich, England, is not subject to daylight-saving rules (local Greenwich time, however, is subject to British daylight-saving time rules). The Americas are negative while Asia and most of Europe tend to be positive. See the National Institute of Standards and Technology Glossary for more information.
As this was basically an American and European convention, the problem of new days was dumped in the middle of the biggest body of water, the Pacific Ocean, twelve time zones away, on the other side of the world. This is the "International Date Line" and tends to be as far from different countries as possible. Crossing the International Date Line adds or subtracts a day from your local time, depending on the direction of travel. While written about by a few people for over a century, the first people this actually happened to, much to their surprise, were survivors of Magellan's crew on their 1522 return to Cape Verde Islands when the day turned out to be Thursday rather than the expected Wednesday.
The U.S. military, and various other international users, have set up a series of single-letter codes to represent different zone offsets.
Y X W V U T S R Q P O N Z A B C D E F G H I K L M
with "Z" ("Zulu") holding the prime meridian, or Zero offset, "Y" being "-12", and "M" being "+12" (note the lack of the letter "J"). These letter codes do not work in countries with offsets not an even multiple of 60 minutes from Zulu (UTC) time.
RFC-822 attempted to implement these letter codes. While the text portion of the RFC described the offsets correctly, the sample numeric values had their signs reversed from the (Mil?) standard RFC-822 was attempting to model. Except for Z, which has an offset of Zero, the success of a program using RFC-822 would depend on what section of the RFC was used to write the code.
There are no international standards for "saving time". Unless bound by some agreement, each country is free to define its own rules on the subject.
Leap seconds may be positive or negative, adding or subtracting a second on the last minute of a day. On a "leap second" day the day ends at the completion of 23:59:58 for negative leap seconds and after the completion of 23:59:60 for positive leap seconds. The next second is "00:00:00" of the next day, which is also known as 24:00:00.
At the time of this writing the last leap second was 1998-December-31 (1998-12-31) and none is scheduled through the end of 2003-December (2003-12). This makes an unusually long dry period for leap seconds, but that's how the world turns. This may last into, or perhaps beyond, 2006. Perhaps.
Leap Seconds are added (or subtracted) at the same time around the globe at 23:59:59 UTC, whatever the local time may be, regardless of any local time zones.
Date        Time      Location
1998-12-31  23:59:60  UTC
1999-01-01  08:59:60  Melbourne Australia
1998-12-31  17:59:60  Local Central USA time
Currently (2004-03) the ITU SRG 7A group that deliberates the future of UTC is considering replacing UTC and its leap seconds in the year 2022 with a new "Temps International" (TI) time when UTC is 50 years old. TI time is not linked with the earth's rotation and would require changing local time zone offsets every few centuries to keep within an hour of daylight times (leap hours?). Time zones maintaining daylight savings time could eliminate the extra hour at the end of savings time for that year's correction.
In 1992 the European Committee For Standardization adopted ISO 8601 under the standard EN 28601 to end the traditional confusion involving periods, slashes, the order of the numbers, and other date formats formerly used throughout Europe.
Companies that wish to do business on a global scale are best served by avoiding date formats local to their own country. Distributing local date formats on a global basis tends to cause confusion to people in other countries. For web pages being read by world wide audiences, using ISO 8601 format dates and times seems the only sane way to present dates and times. One Internet poll decidedly favors using ISO 8601 format for date and times on web pages.
A growing number of individuals and companies are changing from writing dates in their traditional local format to using ISO 8601 compliant dates and times (e.g., yyyy-mm-dd hh:mm or yyyy-ooo hh:mm) in everyday documents. This is especially true for documents used in international trade. ISO 8601 week numbers are also increasingly being used to specify specific weeks.
The growing availability and popularity of low cost digital watches and the increasing use of computers is making the hh:mm and hh:mm:ss formats adopted by ISO 8601 for time well known throughout the world.
The United Nations Economic Commission for Europe, Working Party on Facilitation of International Trade Procedures, Recommendation 07, The Numerical Representation of Dates, Time and Periods of Time, is based on the ISO 8601 standard. Visit UN/ECE Trade Facilitation Recommendation 07 at http://www.unece.org/cefact/rec/rec07en.htm for more information.
NOTE: While the following interpretation of ISO 8601 should be sufficient for the vast majority of users, it is still only a summary. The actual standard is much longer, repeats many details, and should be consulted for full details. Major corporations or people developing applications particularly sensitive to date time standards should purchase an official copy of the standard rather than rely on comments or free draft copies.
Those who want more, but without the standard, may visit RFC-3339, which defines a date and time format for use in Internet protocols that is a profile of the ISO 8601 standard for representation of dates and times using the Gregorian calendar.
NOTE: Because ISO 8601 is so vast and complex many companies are issuing their own standard that follows only a selected subset of the full standard. In practice this tends to be the yyyy-mm-dd format with an hh:mm or hh:mm:ss format for time. When more accuracy is needed, the hh:mm:ss,nnn format (note comma) allows whatever precision is appropriate to be used. Trailing zeros are always added to have all times use the same number of significant digits.
The ISO 8601 format is large and complex and sets forth many ways of representing dates and times. The three major forms of date and times defined by ISO 8601 follow. In practice only the first two formats enjoy general use.
yyyy-mm-ddTHH:MM:SS
yyyy-dddTHH:MM:SS
yyyy-Www-dTHH:MM:SS
Note that hyphens separate date fields and colons separate time fields. Dates using these characters are said to be in "extended format". The fields are:
- yyyy-mm-dd
- Year (usually four digits, but see "truncation", in section 4.2.3), month, and day. When present, month and day are written using two-digit numbers, with leading zeros as appropriate. Month is 01 through 12 and day is 01 through 31, as appropriate.
printf( "%04d-%02d-%02d", d.tm_year+1900, d.tm_mon+1, d.tm_mday );
NOTE: As a partial transition to ISO 8601, a number of web sites use the month name or abbreviation in a date when changing their dates to use year, month, day formats. The month names January through December and the weekdays Monday through Sunday are acknowledged in 4.3.2.1 of ISO 8601:2000 for reference, but not for official use. Abbreviations of these names are not mentioned in the standard.
NOTE: the standard requires that years on or before 1582 be avoided except by mutual agreement. For practical purposes the year 1752 may provide a safer limit. The standard assumes the Gregorian calendar is run backwards as if it were always in use (a "proleptic" calendar). This is unlikely to be acceptable to many uses involving historic dates. Historians discussing early dates tend to use the Julian calendar, and for dates before 45 BC, a proleptic Julian calendar.
Thus proleptic years need to be considered an agreed upon, but imaginary, calendar when referring to dates before the Gregorian calendar correction became effective. Such dates need to be converted to the appropriate local date.
- yyyy-ddd
- Year and Ordinal Day number (001-365, 366 on leap years: always three digits).
printf( "%04d-%03d", d.tm_year+1900, d.tm_yday+1 );
- yyyy-Www-d
- Year and Week Number (01 to 52 or 53). The week number may be followed by an optional "day of week" number with 1 being Monday and 7 being Sunday.
printf( "%04d-W%02d-%d", d.tm_year+1900, Week, WeekDay );
- T
- The literal character T is used to separate the date from the time of day when combining dates and time. While an upper case "T" is the letter of choice a lower case "t" may be used if the upper case "T" is not available in the character set. See "Truncation and Reduction", in section 4.2.3, for details of omitting the T from dates and times.
If only a time field is present, a T may be placed before it to identify it as a time: T010203
While replacing the "T" with a space makes the date and time much easier for human eyes, and appears to be a very common practice, technically it is a violation of strict ISO 8601 notation (see ISO 8601:2000 4.4). This is true for spaces embedded anywhere within a date or time.
However, while the original scope for ISO 8601 was for computer to computer "Information Interchange", ISO 8601 is now being used in many areas the original standards committee never considered when the standard was designed. Thus an out of specification space character is acceptable for displaying dates for people as computer to human transfers are not strictly within ISO 8601's scope. Even on computer transfers this date can be made in scope if the date and time are considered two separate, but related, values.
Suggestions from this author for character format dates and times: Use the "T" when the prime audience for dates and times are other software packages and space for dates and times read by humans. When using spaces, ensure you word the specification to state/imply that separate date and time fields are being used rather than a single date and time field to avoid being accused of being an idiot by ISO 8601 purists.
- HH:MM:SS
- time stamp, in a 24-hour clock. Typically each portion of the time is written using two-digit numbers, using leading zeros as needed.
printf( "%02d:%02d:%02d", d.tm_hour, d.tm_min, d.tm_sec );
Fractional time can be represented using decimal notations after the HH, MM, or SS. Typically after the SS. The character ISO 8601 prefers to use for the decimal character is a comma (,). The only other choice is a period (".", i.e., a "full stop"). Two digits will always be found to the left of the decimal sign. Once a fraction is started, subsequent MM or SS fields, if any, must be omitted (e.g., 15,5:30 is invalid).
printf( "%02d:%02d:%02d,%03d", d.tm_hour, d.tm_min, d.tm_sec, (int)ftime_time.millitm );
printf( "%02d:%02d:%06.3f", d.tm_hour, d.tm_min, FloatSecond );
(NOTE: do not use %g notation!)
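The yyyy-ddd ordinal form in the field list above often needs to be converted back to a month and day. A minimal sketch of one way to do that follows. The names is_leap and ordinal_to_month_day are assumptions made for this example, and the code assumes the Gregorian leap year rules.

```c
#include <stdbool.h>

/* Gregorian leap year test (rules of 4, 100, and 400). */
static bool is_leap(int y)
{
    return y % 4 == 0 && (y % 100 != 0 || y % 400 == 0);
}

/* Converts an ordinal day (001..365, or 366 in leap years) into a
 * month (1..12) and day of month (1..31).
 * Returns false when the ordinal is out of range for the year.
 * Hypothetical helper, for illustration only. */
bool ordinal_to_month_day(int year, int ordinal, int *month, int *day)
{
    static const int len[12] = { 31, 28, 31, 30, 31, 30,
                                 31, 31, 30, 31, 30, 31 };
    int limit = is_leap(year) ? 366 : 365;

    if (ordinal < 1 || ordinal > limit)
        return false;
    for (int m = 0; m < 12; m++) {
        /* February gains a day in leap years. */
        int n = len[m] + ((m == 1 && is_leap(year)) ? 1 : 0);
        if (ordinal <= n) {
            *month = m + 1;
            *day = ordinal;
            return true;
        }
        ordinal -= n;   /* skip past this month */
    }
    return false;       /* not reached for a valid ordinal */
}
```

For example, ordinal day 060 is February-29 in the leap year 2000 but March-01 in 1999.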
More Information: There is quite a lot more to ISO 8601, only some of which is covered in this section. See the standard if you need serious or legal details. This author maintains a list of locations where copies of ISO 8601 may be accessed in ISO 8601 Commentary Links. The list is kept outside this document due to its constantly changing nature.
Leap years are years evenly divisible by 4, unless the year is evenly divisible by 100. However years evenly divisible by 400 remain leap years. ISO 8601 considers the proleptic Gregorian year "0000" a leap year.
See "Gregorian Calendar and Leap Years", 3.3, for the history of Leap Years.
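The leap year rules above translate directly into code. The following is a minimal sketch. The function names is_leap_year and days_in_year are assumptions made for this example, and the code assumes proleptic Gregorian year numbers.

```c
#include <stdbool.h>

/* Gregorian leap year rule: divisible by 4, except years divisible
 * by 100, unless the year is also divisible by 400.
 * Year 0 (proleptic Gregorian "0000") tests as a leap year. */
bool is_leap_year(int year)
{
    if (year % 400 == 0)
        return true;
    if (year % 100 == 0)
        return false;
    return year % 4 == 0;
}

/* Number of days in the given year: 365 or 366. */
int days_in_year(int year)
{
    return is_leap_year(year) ? 366 : 365;
}
```

Testing the 400 rule first keeps each line a direct reading of one clause of the rule. Programs that test only divisibility by 4 happen to get 2000 right but get 1900 and 2100 wrong.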
Leap seconds add or subtract a second to a day. Positive leap seconds added to a day are denoted 23:59:60 and occur at the end of the day. Midnight may be written as 00:00, the start of a day, or as 24:00, the end of the day it is on. This explicit ISO definition of 24:00 being 00:00 of the next day is not well known.
If 24:00 must be used, only use 24:00 as an ending time. Events using 24:00 should not actually be in, billed to, cross over, or otherwise use, midnight. Rather, treat 24:00 as an instant after the completion of the current day (instant after 23:59:59.9999999...).
Basic versus Extended Formats: The "basic" format provides the minimum characters needed for the desired precision (e.g., 19991231T235900). The "extended" format adds additional separator characters, by specific rules, to make fields easier to read (e.g., 1999-12-31T23:59:00).
Expanded Formats: By mutual agreement of all parties the number of digits in a year may be expanded to record years with values greater than 9999. Further, a plus (+) or minus sign (-) must precede the year's value to indicate if the year is A.D., which uses plus, or B.C., which uses minus. The use of a sign appears to be required, even if only A.D. (positive) years are being used.
This expanded year format may be used in any year field. The number of digits in a year must be part of the mutual agreement.
Great caution needs to be observed when using expanded formats as ISO 8601 requires the Gregorian calendar to be used for all specifications. The break between Julian and Gregorian is not acknowledged by ISO 8601. This makes early dates, especially those before the local calendar conversion, suspect. It is common practice by historians to use Julian calendars before 1582. See the notes in Date Formats, 4.2.2, about "proleptic" calendars for more details about this problem.
Week Numbers: ISO 8601 also allows for week numbers to be tracked using a yyyy-Www notation rather than using month and day numbers. Week "01" is defined as being the first Monday through Sunday week that has a Thursday in it (author's hint: weeks containing January-04 are always week 01).
Day numbers in a week (ISO ordinal day numbers) start at 1 for Monday and
run through 7 for Sunday.
There are always three days before Thursday
and three days after Thursday in ISO weeks.
printf( "%04d-W%02d", d.tm_year+1900, WeekNumber );
printf( "%04d-W%02d-%1d", d.tm_year+1900, WeekNumber,
DayInWeek );
Notes:
In the following tables the left table shows every possible ending week for an ISO 8601 year. Each row in this table represents one of the possible years. The corresponding row of the right-hand table shows the first week for the following ISO 8601 year.
Last week of prior year (December)   ==>   First week of following year (January)
 M Tu  W Th  F St Su                        M Tu  W Th  F St Su
22 23 24 25 26 27 28                       29 30 31  1  2  3  4
23 24 25 26 27 28 29                       30 31  1  2  3  4  5
24 25 26 27 28 29 30                       31  1  2  3  4  5  6
25 26 27 28 29 30 31                        1  2  3  4  5  6  7
26 27 28 29 30 31  1                        2  3  4  5  6  7  8
27 28 29 30 31  1  2                        3  4  5  6  7  8  9
28 29 30 31  1  2  3                        4  5  6  7  8  9 10
Examples: the date 2001-Dec-31 (a Monday) is considered to be a part of the first week of 2002 in the ISO week calendar. This is written as 2002-W01-1 (that is "Year 2002, Week 01, Day 1"), even though in the Gregorian calendar it is actually the last day of 2001. The next day, 2002-Jan-01 (Tuesday), is therefore 2002-W01-2 ("Year 2002, Week 01, Day 2").
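The week rules above can be sketched in C by locating the Thursday of the week that contains a date: the year that Thursday falls in is the ISO week year, and its position in that year gives the week number. This is one possible approach, not the standard's own wording. The names iso_week_date and format_iso_week_date are assumptions made for this example, and the code assumes the tm_year, tm_yday, and tm_wday fields of the struct tm are already valid (as they are after gmtime, localtime, or mktime).

```c
#include <stdio.h>
#include <time.h>

/* 365 or 366, by the Gregorian rules of 4, 100, and 400. */
static int days_in_year(int y)
{
    return (y % 4 == 0 && (y % 100 != 0 || y % 400 == 0)) ? 366 : 365;
}

/* Fills in the ISO week year, week (1..53), and weekday
 * (1 = Monday .. 7 = Sunday) for a broken-down time.
 * Hypothetical helper, for illustration only. */
void iso_week_date(const struct tm *t, int *iso_year, int *week, int *weekday)
{
    int year = t->tm_year + 1900;
    int wd   = (t->tm_wday + 6) % 7;   /* 0 = Monday .. 6 = Sunday */
    int thu  = t->tm_yday - wd + 3;    /* 0-based day of year of this week's Thursday */

    if (thu < 0) {                     /* Thursday lies in the previous year */
        year -= 1;
        thu  += days_in_year(year);
    } else if (thu >= days_in_year(year)) {   /* Thursday lies in the next year */
        thu  -= days_in_year(year);
        year += 1;
    }
    *iso_year = year;
    *week     = thu / 7 + 1;           /* first Thursday of a year is in week 01 */
    *weekday  = wd + 1;
}

/* Writes "yyyy-Www-d" into buf and returns snprintf's result. */
int format_iso_week_date(const struct tm *t, char *buf, size_t size)
{
    int y, w, d;

    iso_week_date(t, &y, &w, &d);
    return snprintf(buf, size, "%04d-W%02d-%d", y, w, d);
}
```

Note that the year printed with a week number is the ISO week year, which can differ from tm_year+1900 for the first and last few days of a calendar year. Where a C99 library is available, strftime with the %G, %V, and %u conversions produces the same three fields.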
Time Intervals: ISO 8601
also allows intervals between two points in time to be specified.
These intervals may be associated with actual time points
or they may be more abstract without being associated with specific time points.
Intervals may include any valid combination of dates and/or times.
Recurring intervals may also be specified.
P2Y3M5DT8H4M3S
1999-12-31T23:59:59/2000-02-29T24:00:00
1999-12-31T23:59:59/P0Y1M29DT1S
P0Y1M29DT1S/2000-02-29T24:00:00
P0004-05-06T12
P0004-156T12
P0004-W22-2T12
Full truncation and reduction seem to be allowed with this alternative format.
R8/P1Y
R6/P15W
R0/P12D
R1/PT12H
The ISO 8601 standard also allows R to be specified with intervals providing a specific time point:
R4/1999-12-31/2000-12-31
This format identifies the duration of the first period (1999-12-31/2000-12-31) along with the number of times it is to be repeated (four times: R4). The above sample is the same as R4/1999-12-31/P1Y.
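Durations written with designators, such as P2Y3M5DT8H4M3S, P15W, or PT12H above, can be picked apart with a small hand-written scanner. The sketch below is a simplified illustration under stated assumptions: it accepts whole numbers only, does not handle fractions or the alternative P0004-05-06T12 form, and does not enforce the order of the designators. The struct iso_duration and the function parse_iso_duration are names invented for this example.

```c
#include <ctype.h>
#include <stdbool.h>

/* Components of a designator-style duration (hypothetical type). */
struct iso_duration {
    long years, months, weeks, days, hours, minutes, seconds;
};

/* Parses strings such as "P2Y3M5DT8H4M3S", "P15W", or "PT12H".
 * Returns false on malformed input. Overflow is not checked. */
bool parse_iso_duration(const char *s, struct iso_duration *out)
{
    struct iso_duration d = { 0, 0, 0, 0, 0, 0, 0 };
    bool in_time = false;   /* true once the T separator has been seen */
    bool any = false;       /* true once at least one component is read */

    if (*s++ != 'P')
        return false;
    while (*s != '\0') {
        if (*s == 'T' && !in_time) {
            in_time = true;
            s++;
            continue;
        }
        if (!isdigit((unsigned char)*s))
            return false;
        long n = 0;
        while (isdigit((unsigned char)*s))
            n = n * 10 + (*s++ - '0');
        /* "M" means months before the T and minutes after it. */
        switch (*s++) {
        case 'Y': if (in_time) return false; d.years = n; break;
        case 'M': if (in_time) d.minutes = n; else d.months = n; break;
        case 'W': if (in_time) return false; d.weeks = n; break;
        case 'D': if (in_time) return false; d.days = n; break;
        case 'H': if (!in_time) return false; d.hours = n; break;
        case 'S': if (!in_time) return false; d.seconds = n; break;
        default:  return false;   /* unknown designator or missing one */
        }
        any = true;
    }
    if (!any)
        return false;
    *out = d;
    return true;
}
```

The T separator matters here for the same reason it matters in the notation itself: without it, P3M (three months) and PT3M (three minutes) could not be told apart.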
Time zones: Time zones may be specified after times by placing a time zone indicator after the time. A zone indicator of Z represents zero offset from UTC (also called "Zulu" time in some places outside of ISO 8601). A notation of ±hhmm (basic format; ±hh:mm in extended format), where "±" is a sign ("+" or "-") and hh and mm give an interval, provides the hours, and optional minutes, offset from UTC. See section 3.6.2, Time Zones, for details on time zones.
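Programs usually hold a zone offset as a signed count of seconds and only convert it to the signed hours-and-minutes notation on output. A minimal sketch of that conversion follows. The function name format_utc_offset is an assumption made for this example. Its return value reports whether the offset fit the notation exactly, since historical offsets with odd seconds cannot be written this way.

```c
#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>

/* Writes a UTC offset given in seconds as "+hh:mm" or "-hh:mm"
 * (extended format). Returns false when leftover seconds had to be
 * dropped, that is, when the offset cannot be represented exactly.
 * Hypothetical helper, for illustration only. */
bool format_utc_offset(long offset_seconds, char *buf, size_t size)
{
    char sign  = offset_seconds < 0 ? '-' : '+';
    long abs_s = labs(offset_seconds);
    long hh    = abs_s / 3600;
    long mm    = (abs_s % 3600) / 60;
    long ss    = abs_s % 60;

    snprintf(buf, size, "%c%02ld:%02ld", sign, hh, mm);
    return ss == 0;
}
```

An offset of 19800 seconds yields +05:30 and -18000 yields -05:00. An offset of 1172 seconds (19 minutes 32 seconds) yields +00:19 and a false return, so the caller can decide how to handle the lost seconds.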
Tip for times intended for internal use by software or for international distribution to people: rather than encoding time zones into a time, simply convert the time to UTC and provide a zone name of Z. A zone of UTC may be used for documents read by people, as the average person will not understand Z. Let the readers convert UTC to their local time. Astronomers and UNIX systems have been using UTC for years. For software developers the zoneinfo files provide a standard way of manipulating time zones. In international distribution time zone names can be near useless and numeric offsets more confusing than a simple UTC time. Note that this document uses such a revision time in its heading area.
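The tip above can be sketched in C as follows. This is only an illustration: the function name format_utc is invented here, and the sample values in the note assume a UNIX-style time_t that counts seconds from 1970-January-01 UTC.

```c
#include <stdio.h>
#include <string.h>
#include <time.h>

/* Write a time_t as an ISO 8601 UTC timestamp with a Z zone indicator,
   for example 1999-12-31T23:59:59Z.
   Returns the number of characters written, or 0 on failure.
   format_utc is a hypothetical helper name. */
size_t format_utc(time_t when, char *buf, size_t size)
{
    struct tm *utc = gmtime(&when);   /* broken-down time in UTC */

    if (utc == NULL)
        return 0;
    return strftime(buf, size, "%Y-%m-%dT%H:%M:%SZ", utc);
}
```

On a system with the UNIX epoch, format_utc((time_t)946684799, buf, sizeof buf) should produce 1999-12-31T23:59:59Z. No time zone table is consulted, so the result does not depend on the local zone settings.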
ISO standard 9945-1 has established a series of standard names for time zones. While this is an ISO standard, it is not used by ISO 8601 in any way. The ISO 9945-1 names use longer words rather than three-character abbreviations, and would not be understood as zones by average readers. This standard seems to specify a standard way of specifying time locales to operating systems rather than time zones to people. The zoneinfo files appear to use them. Information on ISO 9945-1 time zone names may be found by visiting:
http://www.bsdi.com/date/
Truncation and Reduction: ISO 8601 allows you to reduce the precision used to record dates and times. The general rule for ISO 8601 is to replace leading fields with hyphens if their values are not desired. While "truncation" must be strictly done in a left to right manner, ISO 8601 rules for these extra hyphens have a few quirks in them. See the following table.
Low order fields may simply be dropped from right to left (reduced) without using any special characters.
Examples of truncations and reductions follow. Rows in italic type note conditions of special interest. People in a hurry should read these even if they skip the others. Truncation requires mutual consent by all involved parties. A format that ensures that all truncated values can be uniquely recovered is essential. Where there is no risk of confusion users may also agree to omit unwanted leading hyphens (again, see the table for quirks). Truncation of century digits of years has been a fertile source of confusion. Caveat Truncator.
The T between the date and time may be omitted in most cases. If a T is present it must be followed by a time. If only a time is given, the T may precede it to better indicate it is a time. The T must be present in truncated dates with reduced times.
Date / Time Examples
Extended | Basic | Truncated | Reduction
1999-12-31
1999-12-31 | 19991231 | |
99-12-31 | 991231 | Century[1] |
--12-31 | --1231 | Full year |
n/a | --12 | Full year | Day
n/a | ---31 | Full year and month |
n/a | 1999-12 | | Day[2]
-99-12 | -9912 | Century[1] | Day
n/a | -99 | Century | Month and day
n/a | 1999 | | Month and day
n/a | 19 | | All but century
1999-365 | 1999365 | |
99-365 | 99365 | Century[1] |
n/a | -365 | Century and year[3] |
1999-W52-5 | 1999W525 | |
99-W52 | 99W52 | Century | Day in week
-W52-5 | W525 | Full year |
-W52 | W52 | Full year | Day in week
n/a | W-5 | All but day in week |
23:59:00
23:59:00 | 235900 | |
-59:00 | -5900 | Hour |
n/a | --00 | Hour and minute |
23:59 | 2359 | | Seconds
n/a | -59 | Hour | Second
n/a | 23 | | Minutes and seconds
23:59:00,153
n/a | --00,153[4] | Hour and minute |
- When dates and times are combined, only the date can be truncated and only the time can be reduced.
- n/a indicates ISO 8601 explicitly declares this use "not applicable". Usually to avoid confusion between two abbreviations of the same format but containing different fields.
- [1] Section 4.6 of ISO 8601:2000 covers "truncation". Section 5.2.1.3 of ISO 8601:2000 expands on truncated representations. By agreement the leading hyphen may be omitted when truncating only the century providing the day number is not also being dropped by reduced precision. Once any other portion of the date is omitted by reduction, the leading hyphens to indicate omitted fields, especially omitted century digits, seem to be required.
The sections on truncation have wording that has generated considerable confusion and discussion. Archives of some discussions can be found at:
http://groups.yahoo.com/group/ISO8601/message/189
http://groups.yahoo.com/group/ISO8601/message/197
Earlier versions of the standard had points covering truncation of century digits that this author found particularly confusing. Readers of earlier versions of this document based on the earlier ISO standard may observe major changes to century truncation rules.
When negative or expanded years are possible, this author will never want to truncate a year.
- [2] When only the day is omitted, a separator must be placed between the year and month. This is the only place basic format uses a hyphen separator, because the six digits "121212" are to be read as 12-12-12 (YYMMDD), not 1212-12 (YYYYMM).
- [3] While --365 could be used (double hyphens), the first hyphen is redundant and therefore omitted.
- [4] The comma or full stop (period) between the whole and fractional digits is not a separator character as hyphens and colons are. Rather the whole and fractional portion of the number make a single field.
In this sample the agreed precision has added 1/1000th of a second to the time. Trailing zeros are required to keep the fraction the same number of digits in each recorded time.
Another date and time standard frequently used within the computing community, especially the Internet community, is included in RFC-822, which provides standards for date and time used on Internet type networks. RFC-822 dates are in the format "Tue, 16 Feb 99 17:56:23 EST".
RFC-822 is no longer used for new development as it has been replaced by RFC-2822, though when many people casually refer to "RFC-822" they are really referring to RFC-2822. This author believes some reasons RFC-822 fell out of favor include its inherent Year 2000 problems, "English Language" flavor, time zone problems, and the introduction of ISO 8601. A short summary follows. If you find something called RFC-822 that doesn't look quite like this, check out the RFC-2822 format.
[ Day, ] dd Mmm yy HH:MM[:SS] zone
These fields are:
- Day
- Optional day of week (e.g., Sun, Mon, Tue, Wed, Thu, Fri, Sat), followed by a comma (tip to computer programmers: consider the comma optional when reading dates, but always generate them if the day of week is present). Always in English.
- dd Mmm yy
- Date. E.g., 07 Nov 99.
This standard always uses a two-digit year. In practice this should be interpreted as the closest year to the current date. As RFC-822 is concerned with the transmission of information over networks there is no need to track periods over 99 years in length.
printf( "%2d %3s %02d", d.tm_mday, MonthName[d.tm_mon], d.tm_year%100 );
- HH:MM:SS
- time stamp, in a 24-hour clock. Seconds is optional.
printf( "%02d:%02d:%02d", d.tm_hour, d.tm_min, d.tm_sec );
printf( "%02d:%02d", d.tm_hour, d.tm_min );
- zone Time zone information.
- Provides offset from UTC to local time in the standard [+-]hhmm format or one of several standard names.
IMPORTANT NOTE: while RFC-822 defines one-letter time zone codes as described in section 3.6.2, the signs of the numeric offsets were reversed. Thus programs should avoid the one-letter codes with RFC-822 time zone offsets as the receiving program may process it exactly the opposite way you expect. Z (zero) is the only reliable code.
Note that years are always two-digits and must be century corrected when read.
RFC-2822 is a big improvement over RFC-822, but still not as good as ISO 8601.
[ day_of_week, ] Day Month Year HH:MM[:SS] Zone
These fields are:
- day_of_week
- Weekday name (Mon / Tue / Wed / Thu / Fri / Sat / Sun). If present, this name must be followed by a comma. Software developers should see the note in RFC-822 about this comma.
- Day
- One or two digit day.
- Month
- Month name (Jan / Feb / Mar / Apr / May / Jun / Jul / Aug / Sep / Oct / Nov / Dec).
- Year
- Year: four digit year.
- HH:MM:SS
- Time of day. Each field being two digits in length. The :SS field providing seconds is optional.
- Zone
- Time zone offset from UTC. In the form of a sign character (+/-) followed by a four digit time providing hours and minutes of the offset. A colon is not used in this field.
A typical RFC-2822 value follows:
Wed, 18 Jul 2001 11:54:46 -0400
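A value in this format can be generated with a few lines of C. The following is a minimal sketch rather than a complete implementation: it always reports the time in UTC with a zone of +0000, the function name format_rfc2822_utc is invented for this illustration, and the sample value in the note assumes a UNIX-style time_t that counts seconds from 1970-January-01 UTC.

```c
#include <stdio.h>
#include <string.h>
#include <time.h>

/* Format a time_t as an RFC-2822 style date in UTC, for example
   "Wed, 18 Jul 2001 15:54:46 +0000".
   Returns the snprintf() result, or -1 if the time cannot be converted.
   format_rfc2822_utc is a hypothetical helper name. */
int format_rfc2822_utc(time_t when, char *buf, size_t size)
{
    /* Fixed English tables: the names must not follow the locale. */
    static const char *const day_name[] =
        { "Sun", "Mon", "Tue", "Wed", "Thu", "Fri", "Sat" };
    static const char *const month_name[] =
        { "Jan", "Feb", "Mar", "Apr", "May", "Jun",
          "Jul", "Aug", "Sep", "Oct", "Nov", "Dec" };
    struct tm *utc = gmtime(&when);

    if (utc == NULL)
        return -1;

    /* tm_year holds year-1900, so add 1900 for the four digit year. */
    return snprintf(buf, size, "%s, %d %s %04d %02d:%02d:%02d +0000",
                    day_name[utc->tm_wday], utc->tm_mday,
                    month_name[utc->tm_mon], utc->tm_year + 1900,
                    utc->tm_hour, utc->tm_min, utc->tm_sec);
}
```

The sample value above, 11:54:46 -0400, is the same instant as 15:54:46 +0000. With a UNIX epoch that instant is time_t 995471686, and passing it to the sketch should give "Wed, 18 Jul 2001 15:54:46 +0000".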
NOTE: the timestamp placed on the initial "From " line of mailbox files written by the sendmail program is in the format "Mon Mar 15 10:25:32 1999", which does not follow RFC-822 or RFC-2822. It is generated by a call to the asctime() function. As this date is not sent over the network it does not need to comply with RFC-[2]822.
While the field order is the same the following fields have different values:
- year
- Two digit value. Values of 00 through 49 represent 2000 through 2049 and values of 50 through 99 represent 1950 through 1999. Values of 100 and beyond represent year 2000 and beyond. Please note that this date window of 49/50 is different from the standard UNIX date window of 68/69.
- zone
- A value of GMT or UT for UTC, otherwise a series of values representing different time zones. EST / EDT / CST / CDT / MST / MDT / PST / PDT / Z
Like RFC-822, all single letter time zones, except Z, are untrustworthy.
Some E-mail messages may use other time zone formats.
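The year rules listed above for this older format can be sketched as a small C function. The name rfc2822_year and the idea of passing the digit count alongside the value are assumptions made for this illustration; a real parser would obtain the digit count while scanning the field.

```c
/* Interpret a year field read from a message header.
   value  is the number that was read.
   digits is how many digits were present in the field.
   rfc2822_year is a hypothetical helper name. */
int rfc2822_year(int value, int digits)
{
    if (digits == 2)                 /* two digits: 49/50 date window */
        return value < 50 ? 2000 + value : 1900 + value;
    if (digits == 3)                 /* three digits, such as 100 for 2000 */
        return 1900 + value;
    return value;                    /* four or more digits: use as is */
}
```

For example, rfc2822_year(49, 2) yields 2049, rfc2822_year(50, 2) yields 1950, and rfc2822_year(100, 3) yields 2000.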
This dating system is not to be confused with the "Ordinal Date" calendar, often, but incorrectly, also called the Julian Calendar. This calendar tracks years like the Gregorian calendar, but uses the day number within the year: day 1 to 365 (366 in leap years), for the day. There is no concept of months.
Julian Day one was determined by combining three commonly used cycles of 28 (solar cycle), 19 (Golden Numbers) and 15 (indiction cycle) years to obtain a larger cycle of 7980 years, which is the least common multiple of the cycles. Traditionally credited to Justus Scaliger (1540-08-05, France) it was also discovered earlier by others. Running the Julian Calendar backwards from the current date to the start of the current "great cycle" avoided the need for negative numbers for years along with providing a uniform way of measuring dates and times during a period of accumulating calendar errors.
Please visit http://hermetic.magnet.ch/cal_stud/jdn.htm for more information about Julian Days.
Today Julian Days are used in various scientific or other extended calculations. As long as you are willing to ignore the 1582 (1752 in Britain and its colonies) calendar correction, staying within recent years, it is surprisingly easy to transform Julian days to "Gregorian" calendar dates.
Chronological Julian Days (CJDs) shift the start of a day backwards 12 hours to match our current concept for the start of each day and follow local savings time conventions. Thus the CJD day zero started at Julian 4713-January-01 00:00:00 BC. When most people refer to "Julian Days" they are referring to the chronological kind. The more casual the reference the more likely it is the author did not even know of noon-to-noon days. Ignoring savings time and calendar corrections:
ChronologicalJulianDay = JulianDay + 0.5
JulianDay = ChronologicalJulianDay - 0.5
A Modified Julian Day shifts midnight like Chronological Julian Days do, but also adjusts day numbers to start at 1858-November-17 to obtain a date no more than five digits in length. This is accomplished by subtracting 2,400,000.5 from the JD, which makes 1858-November-17 00:00 UTC MJD zero. The resulting day numbers are more manageable to work with when using large numbers of dates. MJDs strictly follow UTC and are not subject to savings time or calendar correction events. MJDs are widely used in the scientific world for logging events.
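The relationships between these day counts can be sketched in C. This is an illustration only, assuming the common definition MJD = JD - 2,400,000.5 (so 1858-November-17 00:00 UTC is MJD 0) and the proleptic Gregorian calendar. The day count uses a well-known integer civil-date algorithm, and all function names here are made up for this example.

```c
/* Days from 1970-01-01 to the given proleptic Gregorian date.
   Works in 400-year eras of 146097 days each. */
long days_from_civil(long y, int m, int d)
{
    long era, yoe, doy, doe;

    y -= (m <= 2);                      /* treat Jan and Feb as end of prior year */
    era = (y >= 0 ? y : y - 399) / 400;
    yoe = y - era * 400;                                 /* year of era, 0..399 */
    doy = (153 * (m + (m > 2 ? -3 : 9)) + 2) / 5 + d - 1; /* day of year, from March 1 */
    doe = yoe * 365 + yoe / 4 - yoe / 100 + doy;         /* day of era */
    return era * 146097 + doe - 719468;
}

/* MJD of the day that starts at 00:00 UTC on the given date.
   1970-01-01 is MJD 40587. */
long mjd_from_date(long year, int month, int day)
{
    return days_from_civil(year, month, day) + 40587;
}

/* Julian Day (noon based) for a Modified Julian Day value. */
double jd_from_mjd(double mjd)
{
    return mjd + 2400000.5;
}

/* Chronological Julian Day for a Julian Day, ignoring savings time. */
double cjd_from_jd(double jd)
{
    return jd + 0.5;
}
```

Under these assumptions mjd_from_date(2000, 1, 1) is 51544, and jd_from_mjd(51544) is 2451544.5, the Julian Day at the midnight that started 2000-January-01.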
IMPORTANT ALERT and QUESTION
MJD values have been five digits in length since 1886-April-04
and a great many MJD date transmission standards
expect and require five digits of MJD.
Yet the authoritative definitions I have been able to
find all define MJD as JulianDay - 2,400,000.5 and not
( JulianDay - 0.5 ) modulo 100,000.
On Sunday 2132-August-31 the Julian Days will reach 2,500,000,
with the MJD overflowing five digits at the following midnight (UTC).
This presents the following questions:
This reminds me of C's struct tm tm_year field overflowing to 100 in year 2000.
Truncated Julian Days are also in use. This is the MJD truncated to four digits (remainder of MJD divided by 10,000) producing a more manageable four-digit date. The cycle returns to zero about every 27.4 years. The first TJD "day zero" was 1968-May-24 (a Friday). The second TJD cycle started on 1995-Oct-10. The third cycle will start on 2023-February-25. Computer programs using TJDs must take all due care to avoid problems when the cycle wraps. TJDs were created by NASA during the times of Apollo moon launches and have been picked up for use in other areas.
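The wrap described above can be handled by keeping a reference date near the true date. The following is a minimal sketch, assuming the TJD is the remainder of the MJD divided by 10,000 as described in the text; the function names and the "nearest cycle" rule are choices made for this illustration.

```c
/* Truncated Julian Day: the remainder of the MJD divided by 10,000.
   The extra arithmetic keeps the result in 0..9999 for negative MJDs. */
long tjd_from_mjd(long mjd)
{
    return ((mjd % 10000) + 10000) % 10000;
}

/* Recover a full MJD from a TJD by picking the cycle that puts the
   result within 5,000 days of reference_mjd, a date known to be
   close to the true one (for example, today's MJD). */
long mjd_from_tjd(long tjd, long reference_mjd)
{
    long base = reference_mjd - tjd_from_mjd(reference_mjd); /* cycle start */
    long candidate = base + tjd;

    if (candidate - reference_mjd > 5000)
        candidate -= 10000;           /* belongs to the previous cycle */
    else if (reference_mjd - candidate > 5000)
        candidate += 10000;           /* belongs to the next cycle */
    return candidate;
}
```

For example, a TJD of 0 received when the reference MJD is 59999 is resolved to MJD 60000 (2023-February-25), not to MJD 50000.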
Centuries follow the same conventions millennia do. We need a "20" in our century digits before we can close out the 20th century.
If you are still not convinced, try to think in Roman numerals like the early calendar designers did. The first year was year I. Each series of numbers starts with a single I: I through V, VI through X, XI through L, LI through C, and so on. Therefore the last year of the first century was year C. In the "natural" order of Roman numerals a single I is needed in the first year of a new century. Therefore year M ended the first millennium and MM ends the second millennium, with year MMI starting the third.
The first year of the Christian Era was dated year I and was preceded by year I B.C. While some calendar calculation formulas consider year "0" to be what is today commonly called "1 BC", the official convention for publication is still "1 BC". Remember, the Romans did not use place-holder mathematics like we do and therefore did not have a number for zero. Roman numerals were still being used in the medieval times our current dating system was invented in.
However, decades seem to start when the one's digit returns to zero.
While people still wring floods of "facts" out of numbers you won't read them here. However, a few fun tidbits follow.
The UK uses the D M Y order for writing dates.
Therefore 20 Feb 2002 was digitally palindromic, 20/02/2002;
and, in the evening, the UK had times of 20:11:02 .. 20:55:02 -
including 20/02/2002 20:02:20.02.
AUTHOR'S REQUEST: I would like to know of any official name for this type of date. The same for the following sequential numbers. ghealton@exit109.com
The first "Even Day" after the Y2K roll over was 2000-02-02, the first since 888-08-28 (sic.). The date of 01-01-01 was also numerically interesting.
- Caius Julius Caesar (a.k.a., Gaius) 0100BC/0044BC:
Pontifex; Emperor; Dictator; Invader; Commentator; Calendar Reformer; Assassinee.
Date involvement: the Julian Calendar and the month July.
Calendar of 365 days, with every fourth year being a leap year.
Also known from: the Shakespeare play.
- Julius Caesar Scaliger: 1484/1558
Father of Joseph Justus Scaliger (1540/1609) of France.
Date involvement: The son JJS introduced the Julian Period.
Julian Period (of 7980 years) and the Julian Day Number (day count from BC 4713-01-01 12:00 GMT = 0.0), from which MJD, CJD, TJD are derived.
It is said that these are named in honor of his father JCS; but others claim it was a reference to the Julian Calendar. Perhaps both.
- Misinformed:
Julian Date is frequently, and improperly, used for ordinal date. The term is popular within the IBM mainframe community, but IBM may well have adopted the term from another source.
A date using a year with an ordinal day number from 1 through 365, 366 on leap years, within the year.
The author wishes to thank Dr John Stockton and his contribution at http://www.merlyn.demon.co.uk/moredate.htm#Jul for this information.
All the assessment in the world can miss bugs like this, making it necessary to test critical paths of programs you inherit.
NOTE: the original UNIX definition for %y of date calls for year-1900 to be returned while the current POSIX standard calls for the last two-digits of the year to be returned. Thus date commands returning 100 for %y in the year 2000 are not Y2K compliant. Your tests should look for this problem value of 100 during year 2000 and values greater than 100 in following years.
Many of the "solutions" used by programmers to keep programs alive after the year 2000 did not truly fix the problems. They only put the software on life support that pushed the bug back a number of years. Companies would be well advised to determine if they have any of these time bombs lurking in their corporate code. Finding near term problems may be of critical interest. The sections on Y2K repairs should help developers spot such fixes and determine the life span of the "repair".
COBOL, RPG, and some other languages naturally allow programmers to explicitly declare variables are to be kept as two-digit values. These uses are very aggressive in keeping two-digit values as two-digits. Always and forever. Many other languages can do the same thing, though it may take more effort. Anything that uses "BCD" (Binary-Coded Decimal) to hold two-digit years must be considered guilty until proven innocent.
In these rules 99+1 yields not 100, but "00", the low-order two digits of the value. The carry of "1" that makes 100 is lost.
This wrap back to zero adversely impacts all date calculations and tests. There are many variations on the theme that depend on the programming language, operating system, and applications the program makes use of. Some sample problems include:
Sadly the scope of Y2K problems was not limited to two-digit years. Even modern programs in modern languages, such as C, perl, and Java, can have problems with year 2000. This author has personally seen bad code in new programs written as late as 1998, and will not be surprised if he finds some problems in 1999 code. Section 8, Bad Coding Techniques, goes into the technical details.
A well-known work around exists for operations where it is not vital for the computer to have the correct date. This may work for industrial, and even many office, applications that are not networked or are sufficiently isolated from other major networks. If your operating system allows it (sorry Microsoft fans), set the date back 28 years. This returns to a previous point in the calendar cycle where the days of the week, leap year calculations, etc., are the same as the current year. The current 28-year cycle is valid for the years 1901 through 2099.
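The 28-year claim is easy to check by program. The sketch below is an illustration under stated assumptions: it uses the proleptic Gregorian calendar, relies on 1970-January-01 having been a Thursday, and all function names are invented for this example.

```c
/* Days from 1970-01-01 to the given proleptic Gregorian date. */
long days_from_civil(long y, int m, int d)
{
    long era, yoe, doy, doe;

    y -= (m <= 2);                      /* treat Jan and Feb as end of prior year */
    era = (y >= 0 ? y : y - 399) / 400;
    yoe = y - era * 400;
    doy = (153 * (m + (m > 2 ? -3 : 9)) + 2) / 5 + d - 1;
    doe = yoe * 365 + yoe / 4 - yoe / 100 + doy;
    return era * 146097 + doe - 719468;
}

/* Day of the week, Sunday = 0. 1970-01-01 was a Thursday (4). */
int weekday(long y, int m, int d)
{
    long days = days_from_civil(y, m, d);

    return (int)(((days % 7) + 7 + 4) % 7);
}

/* Gregorian leap year rule. */
int is_leap(long y)
{
    return (y % 4 == 0 && y % 100 != 0) || y % 400 == 0;
}

/* Nonzero when the same month and day 28 years earlier falls on the
   same weekday in a year with the same leap year status. */
int setback_matches(long y, int m, int d)
{
    return weekday(y, m, d) == weekday(y - 28, m, d)
        && is_leap(y) == is_leap(y - 28);
}
```

Under these assumptions setback_matches(2000, 2, 29) is nonzero (both 2000-February-29 and 1972-February-29 were Tuesdays), while setback_matches(2110, 1, 1) is zero because the non-leap year 2100 falls inside the span.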
If you "setback" the dates in your database 28 years, and manually
subtract 28 years from all dates entered into your software, etc.,
you may be able to continue using old software with a
"cosmetic problem" of the wrong year being printed on reports.
This is actually being done in places until they properly fix their
problems.
(Do you really care if, say, the power company's computers
believe the year 2000 was 1972 as long as you got power in 2000?)
Note: some businesses may not be able to do this for legal reasons
if the reports are legal records, etc.
Any date setbacks should have been started months before the year 2000 to debug them before the panic of 2000.
Using date setbacks has its own risks. Most of the problems, and defenses, involved in (9) Testing Applications By Changing CPU Dates also apply to date setbacks. This includes, but is not limited to,
NOTE: the Network Time Protocol software generally ignores years transmitted in the type packets, assuming the CPU's local clock has been properly tracking the current year. Any times obtained from other systems with NTP may therefore preserve the setback, advancing to the next year as appropriate. Test your system for this if you rely on NTP.
At the dawn of the age of business computers, computers simply took over the reading of Hollerith cards. To make switching to computers as easy as possible, most companies kept the same card format for the computers as they did with their EAM machines. Back then computer memory was very expensive, well exceeding $1 (U.S.) a byte. Add to this the huge amount of labor required for changing the existing programs using the files and retraining of personnel and you get a huge amount of money. Spice this total with sheer corporate inertia and you get a recipe for sticking with two-digit years long after card systems vanished at a company. The designs of some languages, such as COBOL, made this very easy to do.
These early computers did not have any concept of time. At that time if a program needed to wait a specific amount of time, the programmer would send the computer into a short loop of instructions that did nothing but waste time. How long the computer stayed in the loop determined how long the wait was. Typically the computer could not do anything useful while in the time wasting loop. Moving the program to a faster computer could make the waits unacceptably shorter.
The next trick this author is aware of was adding "real time" timers to the computer to allow software to track how long various processes the computer was performing or watching took. The timers also allowed computers to track time of day and dates. As these early computers were slow, the timers tracking "wall clock" times were often driven by the frequency of the incoming line power. This made them independent of the inconsistent clock frequencies in the computer's hardware. Sixty-hertz became the standard in the United States. Fifty-hertz in much of Europe.
The next step was to add quartz timers to computers to split times into much finer fractions. Use of batteries allowed computers to keep time when powered off. (As the power companies spend millions of dollars to ensure the long term accuracy of line power is very good, even to this day, without special software adjustments, most modern quartz crystal clocks will drift from true time faster than the old-style line frequency timers.)
While people may look at 2038 and say, "that is 36 years away, why worry?", this is what people said in the early 1900s when setting up EAM machines to use two-digit years. Even if the machines and software do not survive, data formats have proven to have much longer life spans.
The time_t values are "binary" time values. As such they are "Year 2000 safe", at least until programmers start doing something dangerous with them, like dividing or taking remainders of divisions. Addition and subtraction for small values are generally safe.
#!/usr/bin/perl
### demonstration of UNIX "2038 Bug"
$EndOfTime = 0x7FFFFFFF; #last value valid as +signed 32-bit
$Format = '%04d-%02d-%02d %02d:%02d:%02dZ' . "\n";
@Gm = gmtime( $EndOfTime ); #break down last moment
printf $Format, $Gm[5]+1900,$Gm[4]+1,$Gm[3],@Gm[2,1,0];
@Gm = gmtime( $EndOfTime + 1 ); #beyond last valid moment
printf $Format, $Gm[5]+1900,$Gm[4]+1,$Gm[3],@Gm[2,1,0];
Output on a system with a signed 32-bit time_t:
2038-01-19 03:14:07Z
1901-12-13 20:45:52Z
Leap seconds (section 3.6.4) may make some small adjustment to the actual time the bug expresses itself.
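The wrap itself can be shown in C without depending on the size of the local time_t. The sketch below is an illustration only: it simulates a signed 32-bit, two's complement time value with the fixed-width types from <stdint.h>, and the function name as_time32 is made up for this example.

```c
#include <stdint.h>

/* Simulate storing a count of seconds in a signed 32-bit time value:
   keep the low 32 bits and read them back as a two's complement number.
   as_time32 is a hypothetical helper name. */
int32_t as_time32(int64_t seconds)
{
    uint32_t low = (uint32_t)seconds;   /* conversion is modulo 2^32 */

    if (low <= INT32_MAX)
        return (int32_t)low;            /* still a valid positive time */

    /* Upper half of the range: map onto the negative values. */
    return (int32_t)((int64_t)low - 4294967296LL);
}
```

One second past 0x7FFFFFFF, as_time32(2147483648LL) returns -2147483648. Counted from 1970-January-01 that is the 1901-12-13 20:45:52Z value the demonstration above printed.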
Some programmers cast (i.e., copy) time_t values to "long" or other data types not properly compatible with time values. This is a cavalier assumption that can not only limit the portability of the program to different environments, it can also cause the code to mysteriously break with future compiler enhancements.
The remaining members of struct tm defined by the ISO/ANSI C standard follow. tm_hour, tm_min, and tm_sec store the time of day using a 24-hour clock. tm_wday provides the day of the week (Sunday=0). tm_yday provides the day of the year (0-364, or 0-365 in leap years). tm_isdst provides a daylight saving time flag (positive if true, zero if standard time, negative if unknown/unsupported).
The actual order of fields within the structure is implementation dependent.
Other operating systems track time in other ways. Nothing is universal.
Interval Timers: as a granularity of a whole second is not sufficient for most operating systems the common practice is for an operating system (Windows, DOS, Linux, etc) to read the clock very carefully during boot. Before reading the RTC the OS will have previously started a higher resolution timer that can be used to track elapsed seconds. Such timers are often called "interval timers" and are used after the RTC is read to track the time of day with resolutions much higher than the RTC provides.
Different operating systems make different use of the RTC and interval timers. See your local operating system for details. Indeed, changing the time on some operating systems only updates the interval timer time and not the RTC. If the system keeps reverting to the old time on each boot this may be what is happening. Either find the command that sets the RTC or set the time from the BIOS on the next boot.
Epoch Year | Microsoft Product | Pivot Year | Special Notes
0001 | Microsoft .NET Framework Class Lib | n/a | 100nsec ticks since 0001-January-01 A.D.
1601 | Microsoft NT/2000 (Win32) | n/a | 100nsec ticks since 1601-January-01 for a range of about 29,200 years
1899 | Visual C++ DATE | n/a | 8-byte floating point value providing days since 1899-December-30
1900 | Excel dates for Windows. | 19/20 (cell entry); DATE function n/a | VT_DATE: days since 1900-January-01 (which has a value of 2... buggy programs accepting 1900-February-29 start at 1). Q95948 Q215094
1904 | Excel dates for Macintosh. | 19/20 (cell entry) | Days since 1904-January-02 (value 1, see all notes for Windows Excel). Switch between Mac. and Windows epochs by using Tools, Options, Calculation, and selecting or clearing the "1904 date system" checkbox.
1980 | MS-DOS{MarkR12b} FAT File Systems and DATE command. | 00/99 |
1980 | Windows 3.11 | n/a |
Other products may use different epochs. Please E-mail me epochs for any major Microsoft product not listed here.
While the time format ostensibly has a resolution of 100 ns, no Win32 platform actually keeps time at that resolution. Windows 2000's clock is only good to 1/64th of a second (15.625 ms), the length of a clock tick. Other Win32 platforms have clock ticks of different lengths, and some can even change the length of a tick on the fly (!).
The Win32 API provides two functions for retrieving the current time, GetSystemTime() and GetSystemTimeAsFileTime(). The former retrieves the current time broken down into a SYSTEMTIME structure (analogous to struct tm), and the latter retrieves it as a 64-bit integer stuffed into two DWORDs in a FILETIME struct. (With the latter function, you can pass it an __int64 variable, with a typecast.)
However, when operating on a 64-bit timestamp, careless mixing of 32- and 64-bit quantities in a single expression in C/C++ can lead to the truncation of the 64-bit timestamp to 32 bits. There are a few easy things you can do to avoid the truncation, however:
While this may smack of a "belt-and-suspenders" approach to programming, this causes endless problems to people who are not careful.
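As an illustration of keeping every operand 64 bits wide, the following sketch joins the two 32-bit halves of a 1601-based timestamp and converts it to seconds since 1970-January-01. It is plain C arithmetic using <stdint.h> rather than Win32 calls, so it makes no claim about the Win32 API itself; the function and macro names are invented for this example.

```c
#include <stdint.h>

/* Seconds from 1601-01-01 to 1970-01-01 (134774 days), and the
   number of 100 ns ticks in one second. Both are 64-bit constants. */
#define EPOCH_DIFF_SECONDS UINT64_C(11644473600)
#define TICKS_PER_SECOND   UINT64_C(10000000)

/* Join the high and low 32-bit halves of a 64-bit tick count. */
uint64_t filetime_join(uint32_t high, uint32_t low)
{
    /* Widen BEFORE shifting: shifting a 32-bit value left by 32 bits
       is not valid C and would lose the high half. */
    return ((uint64_t)high << 32) | low;
}

/* Convert 100 ns ticks since 1601-01-01 to whole seconds since
   1970-01-01. The result is negative for times before 1970. */
int64_t filetime_to_unix(uint64_t ticks)
{
    return (int64_t)(ticks / TICKS_PER_SECOND) - (int64_t)EPOCH_DIFF_SECONDS;
}
```

With these definitions a tick count of 116444736000000000 corresponds to second zero of the 1970 epoch. Every constant is declared as a 64-bit value, so no intermediate result is computed in 32 bits.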
If your code needs to be acceptable to more compilers than just Microsoft Visual C++, the following preprocessor incantations will probably be useful:
#if defined(WIN32)
#define _64(x) x ## i64
typedef __int64 int64;
typedef unsigned __int64 uint64;
#elif defined(__GNUC__)
#define _64(x) x ## LL
typedef long long int int64;
typedef unsigned long long int uint64;
#else
#error woops, need to fix _64() for this compiler!
#endif
The kicker is that the TOD is incremented at some decimal increment, such as every millisecond, rather than a true binary fraction. The result is a "long second" where the second is incremented once every 1.048576 seconds.
Individual programs and languages that run on IBM mainframes have large libraries of time and date options. See specific programs and languages for individual details. The IBM publication The Year 2000 and Two-Digit Dates - A Guide to Planning and Implementation GC28-1251-05, may help mainframe users get started on time and date related issues. An unofficial copy was available at http://ano2000.kpnqwest.pt/ibm/best_ibm.pdf in PDF format.
Leap seconds (section 3.6.4) may make some small adjustment to the actual time the time overflows.
Macintosh systems can be set to use the ISO 8601 date and time format. Please visit http://karchive.info.apple.com/article.html?orig=til&artnum=60753 for details.
Leap seconds (section 3.6.4) may make some small adjustment to the actual time the epoch overflows.
int TheTime = time( (time_t *)NULL );/* not portable! */
This works under most current UNIX systems where int and long are the same. But if the program is "fixed" by moving it to a 64-bit system, it will still fail in 2038 if int remains 31-bits while time_t (and long) grows to 63-bits.
Programs using "int" for time will immediately fail when migrated to compilers defining ints as 16 bits.
Using ints for variables that contain times makes it much more difficult to track down all statements impacted by date calculations during repairs or enhancements to their logic.
Not as many immediate portability problems as int, but this is still pasting a "kick me" sign on your back. Especially now that ANSI/ISO C 99 (C99) supports long long data types for 64-bits, leaving int and long for 32 bits. If time_t internally becomes long long programs assuming it is just long will be in trouble.
When printing time_t values on printf() type statements (or other variable argument list statements), the value should be explicitly cast to some well-known data type if a time_t variable is not expected.
printf( "time_t TheTime=%llu\n", (unsigned long long)TheTime );
The only way, in a portable manner, to obtain the interval, in seconds, between two times in the C language is to use the difftime() function.
Unknown = Time1 - Time2;
Seconds = difftime( Time1, Time2 );
For other languages study the definition of the time functions very carefully. If needed, find the corresponding function in that language.
While starting at 1970-January-01 is popular for UNIX, other operating systems are free to choose different base times. (And they do!)
Calculating struct tm years by using modulo 100 rather than subtracting 1900.
x.tm_year = Year4 % 100; /* popular, but sad, method */
Time = time( (time_t *)NULL );
AboutSixMonthsAgo = Time - ( 6 * 30 * 24 * 60 * 60 );
AboutSixDaysAgo = Time - ( 6 * 24 * 60 * 60 );
/* above may be safe for crude approximations */

Years = ( Time2 - Time1 ) / ( 365 * 24 * 60 * 60 );
/* very likely a problem, especially if you need to catch
   differences in years. does not allow for time zones (OK
   in UTC time), and does not allow for leap years, nor
   leap seconds. */

{ /* one accurate way to calculate year offsets follows */
    struct tm StructTm1, StructTm2;
    StructTm1 = *localtime( &Time1 );
    StructTm2 = *localtime( &Time2 );
    Years = StructTm2.tm_year - StructTm1.tm_year;
}
"Pivot Years" allow two-digit years to be converted to four-digit years. The standard pivot year for UNIX is 68 (e.g., 68/69). 69-99 are assumed to be 1969 through 1999 while 00-68 are assumed to be 2000 through 2068.
This author prefers to provide two values for pivot years (68/69) rather than just a single number (68) to avoid confusion on exactly where the boundary is.
Mistake:
Different pivot years in different programs.
Repair:
The same pivot year should be used in all programs.
Mistake:
Pivot years of 69/70 must not be used in most UNIX or C environments.
Due to time-zone differences, the Americas, and other areas,
actually see years of 69 for small
values of time_t on 1970-January-01 where the time_t value is less than
the offset from UTC time (e.g., before 5AM on EST time).
Since these areas have a negative offset from UTC,
their early values are the day before UTC time: 1969-December-31.
Programs using 69/70 for pivot years incorrectly translate 1969 into
2069 for these early times.
Repair:
Use pivot years of 68/69.
The standard C method for tracking year values is the tm_year member of struct tm. The value of tm_year is kept in "Year-1900". Many people assume these are two-digit values that will become zero in the year 2000, but this is not true. Starting in the year 2000 the value will use the three digit value 100, year 2001 will use 101, etc.
The following is a popular perl bug. Creative programmers in other languages have migrated this bug to languages where it is difficult to express it.
@LocalTime = localtime time;    # obtain local time
$Year = "19" . $LocalTime[5] if $LocalTime[5] >= 70;
$Year = "20" . $LocalTime[5] if $LocalTime[5] < 70;
This not only has the pivot year bug previously described (69/70), it will also yield the five-digit year "19100" for the year 2000. Even if it did work, you would get 200 through 209 for the years 2000 through 2009. The correct fix is to use:
$Year = $LocalTime[5] + 1900;
Perl is such that you can use the numerically calculated value anyplace a character string is needed.
Also see 8.2.4, Errors In Printing Years for common problems relating to printing tm_year values.
Incorrect usage of tm_year values is a major source of errors!
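The rule above can be sketched in C as follows. This is only a minimal illustration: the helper names TmFullYear() and FormatYear() are hypothetical and do not come from any standard library.

```c
#include <stdio.h>
#include <time.h>

/* TmFullYear: hypothetical helper. tm_year holds "year - 1900",
 * so the calendar year is always tm_year + 1900. */
int TmFullYear( const struct tm *Tm )
{
    return Tm->tm_year + 1900;
}

/* FormatYear: hypothetical helper that writes the four-digit
 * year into Buf, which should hold at least 5 bytes. */
void FormatYear( const struct tm *Tm, char *Buf, size_t Size )
{
    snprintf( Buf, Size, "%04d", TmFullYear( Tm ) );
}
```

A tm_year of 99 gives 1999 and a tm_year of 100 gives 2000, with no special case at the century boundary.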
When receiving a two-digit year, simply using the date without special checks is wrong.
Programs reading two-digit years need to add "100" to the year if it is too small. A well-used pivot year is 68:

#define PIVOT_YEAR 68   /* window year (test <= this) */
int year2;              /* current year as value of 00-199 */
if ( year2 <= PIVOT_YEAR )
    year2 += 100;

This makes a value that may be used with values in the "tm" structure for years of 1969 through 2068. Whatever pivot year you use, be sure all programs agree on it.
More on pivot years can be found in section 10.8.
When reading four-digit dates relating to struct tm values, be sure to SUBTRACT 1900 from the date rather than taking modulo 100 (remainder) of it before setting values in "tm" structures.
Wrong.tm_year = Year4 % 100;    /* VERY BAD */
Good.tm_year = Year4 - 1900;    /* correct tm_year */
This code will fail in the year 2000 when it prints "19100".
void WrongDate ( struct tm *year )
{
    printf("The date is 19%d-%2d-%d",
        year->tm_year, year->tm_mon, year->tm_mday );
}
If it was just printing "The date is %d-..." it would print a year of "100-...", yet another way to go wrong.
A correct way to print two-digit tm_year values follows:
printf( "%02d/%02d/%02d\n", x.tm_mon+1, x.tm_mday, x.tm_year%100 );
Programs expecting to "correct" the previous problem by only printing the last-two digits of "tm" years must not overlook printing leading zeros for the years 2000 to 2009.
printf("The date is %2d/%2d/%2d", year->tm_mon, year->tm_mday, year->tm_year % 100 );
While numerically valid, single-digit years may not be acceptable to customers, especially if just %d is used to format the year, which will format a single digit for the years 2000 through 2009.
Using "%02d" will properly print two-digit years. ("%02d" is also suggested for month and day.)
NOTE: "19%d" would print "190" to "199" for these ten problem years.
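One way to avoid hand-built year formats altogether is to let the standard strftime() function do the work. The sketch below produces an ISO 8601 date; the helper name FormatIsoDate() is hypothetical, and the caller is assumed to supply a valid struct tm.

```c
#include <time.h>

/* FormatIsoDate: hypothetical helper that writes "YYYY-MM-DD" into
 * Buf, which must hold at least 11 bytes. Returns the number of
 * characters written, or 0 if Buf is too small. */
size_t FormatIsoDate( const struct tm *Tm, char *Buf, size_t Size )
{
    /* %Y expands tm_year to a full year, %m and %d add leading zeros */
    return strftime( Buf, Size, "%Y-%m-%d", Tm );
}
```

Because the library converts tm_year itself, there is no "19" prefix to get wrong and no leading zero to forget.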
Programs that format years themselves rather than relying on printf type formatting may produce non-numeric characters starting in the year 2000.
char TheDate[9] = "mm/dd/yy";               /* formatted year */
TheDate[7] = year->tm_year % 10 + '0';      /* isolate 1s digit */
TheDate[6] = year->tm_year / 10 + '0';      /* isolate 10s digit */
Once year is greater than 99 TheDate[6] will contain non-numeric values (TheDate[6] = 100 / 10 + '0' is 10 + '0', or ':' for a year of ":0"). Windows 3.x, and some early 95/NT programs, pull this trick in places.
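A possible repair is to reduce the year to two digits before isolating each digit. The following is a sketch only: SetYearDigits() is a hypothetical name, and it assumes tm_year is not negative (years of 1900 or later).

```c
#include <time.h>

/* SetYearDigits: hypothetical helper that stores a two-digit year
 * into positions 6 and 7 of an "mm/dd/yy" buffer. */
void SetYearDigits( char *TheDate, const struct tm *Tm )
{
    int Year2 = Tm->tm_year % 100;              /* 0-99, even after 1999 */

    TheDate[6] = (char)( Year2 / 10 + '0' );    /* isolate 10s digit */
    TheDate[7] = (char)( Year2 % 10 + '0' );    /* isolate 1s digit  */
}
```

With a tm_year of 100 both positions receive '0', so the year prints as "00" rather than ":0".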
Year--;
This logic would happily return -1 rather than 99 when backing up from year 2000 on programs that use two-digit years (00 for 2000).
Additional information is in section 10.4 of Good Coding Techniques.
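Where two-digit years cannot be avoided, stepping them with modulo arithmetic keeps the result inside 00-99. The helper names below are hypothetical and the code is only a sketch of the idea.

```c
/* PrevYear2: hypothetical helper, previous two-digit year (00-99).
 * Adding 99 before the modulo avoids a negative intermediate value. */
int PrevYear2( int Year2 )
{
    return ( Year2 + 99 ) % 100;    /* 00 becomes 99, 01 becomes 00 */
}

/* NextYear2: hypothetical helper, next two-digit year (00-99). */
int NextYear2( int Year2 )
{
    return ( Year2 + 1 ) % 100;     /* 99 becomes 00 */
}
```

The result still needs a pivot year to decide which century is meant, and still needs a leading zero when printed.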
Programs that only use the first test will work for years from 1901 to 2099 (and are technically not having Y2K errors). Programs that just use the first two tests will fail in the year 2000.
The C language, and other languages using year-1900 formats, are open to the bug of dividing year-1900 values by 400 to test for leap years.
LeapSw = !(x.tm_year % 400) || ( !(x.tm_year % 4) && (x.tm_year % 100) );
This is a serious bug as you need to add 1900 to get proper results:
LeapSw = !((x.tm_year+1900) % 400) ||
    ( !((x.tm_year+1900) % 4) && ((x.tm_year+1900) % 100) );
Typical failure symptoms include calculating the wrong day of week (Monday, Tuesday, etc.) on and after 2000-03-01, and calculations advancing over February-29 being short by one day.
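One way to keep the 1900 offset from being forgotten is to isolate it in a single wrapper. This is a sketch; the names IsLeapYear() and TmIsLeapYear() are hypothetical.

```c
#include <time.h>

/* IsLeapYear: Gregorian leap year test for a full four-digit year. */
int IsLeapYear( int Year )
{
    return !(Year % 4) && ( (Year % 100) || !(Year % 400) );
}

/* TmIsLeapYear: hypothetical wrapper that converts the year-1900
 * value in tm_year to a full year before testing it. */
int TmIsLeapYear( const struct tm *Tm )
{
    return IsLeapYear( Tm->tm_year + 1900 );
}
```

A tm_year of 100 (the year 2000) tests as a leap year, while a tm_year of 0 (the year 1900) does not.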
Some urban legends have a fourth test as well based on the fact that the average tropical year is about 365.242199 days (depending on which reference book you look at) while the Gregorian leap year calculations use 365.2425 (exactly) for an error of about 0.000301 days per year.
Other legends call for a double leap year in the future.
While a correction will be needed sometime in the distant future, the current rules are so accurate, and all of the earth's movements sufficiently chaotic, that the actual year this correction will be needed cannot be predicted with sufficient accuracy. The popular guesses so far range between the year 3400 and 4300. By this time mankind may decide it's easier to adjust the earth's orbit than all the calendar calculation programs.
Making errors when calculating times that cross minute, hour, and especially day boundaries is a very popular pastime in nearly every computer language. All it takes is a programmer overlooking several obvious or subtle features of time. The more common ways of making mistakes are discussed in this section in ways that apply to all programming languages. The lengths of minutes, hours, and days are also problematic when crossing midnight, times spanning local savings time changes, and times involving Leap Seconds. Any error in either calculating the next (or previous) day or crossing midnight can be loosely called a "length of day" bug. Many errors in calculating lengths of minutes and hours also adversely impact length of day calculations.
Programs internally using UTC time without dates tend to avoid most of these problems. Programs that use dates or local times tend to have more problems. Problems due to Leap Seconds tend to range over all classes of programs. Programs that calculate specific ending times to an hour, minute, and second in the future should be considered guilty until proven innocent. Bugs due to improperly processing local savings time tend to be particularly obnoxious as developers can easily overlook them or, not properly understanding savings time, insist their buggy code is correct ("it works for me!"). Sometimes these bugs may lurk inside a program for years before causing problems. Bugs showing times that are an hour off tend to be easy to find. Bugs silently miscalculating times that result in spoiled output may present no clue to what caused the failure.
Attempting to change to the next day, hour, or minute by adding a constant to the current time, especially local times, is a near universal cause for these bugs. Such calculations do not always change to the expected time. Other developers simply do not realize how provincial savings time rules are and apply their local rules to the rest of the world, or even their own nation. Operating systems that do not follow leap seconds diverge from the true time when a Leap Second strikes. Operating systems that do follow leap seconds can cause problems for applications that do not.
Using UTC time can greatly reduce these problems as UTC time is not subject to savings time changes. While leap second changes remain, these occur much less often and may only span one second. More information on leap seconds is found in section 3.6.4.
The most popular ways to go wrong include:
Adding 60 seconds to the current time should be considered "a minute from now" and not the "next minute". The same applies to hours and days.
Adding a constant needs to be considered an "approximate" advancement in time: it calculates an "interval" rather than an "ending time".
When there is a need to advance across local dates, with no real need for time, using a base of 12:00 (local noon) for calculations may help. Do not use midnight (00:00) or the current local time. This should help in avoiding errors due to savings time and leap second changes.
time_t Time, Midnight, Noon, Tomorrow;
struct tm t;

time( &Time );                  /* fetch current epoch time */
t = *localtime( &Time );        /* break down local time */
Midnight =                      /* calculate midnight in epoch time */
    Time - (((t.tm_hour*60)+t.tm_min)*60+t.tm_sec);
                                /* (ignore savings time changes) */
Noon = Midnight + 12*60*60;     /* noon, epoch time */
Tomorrow = Noon + 24*60*60;     /* time FALLING in tomorrow */
                                /* (not ALWAYS noon) */

If your use of date is sufficiently casual that you don't care if it is off once in a while, then document the fact and just use the traditional

Tomorrow = UtcTime + 24*60*60;  /* calculate next day, ignoring leap seconds */
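An alternative to adding constants is to advance the calendar fields and let the standard mktime() function normalize them. The sketch below is one possible approach, not the only one; the name NoonTomorrow() is hypothetical.

```c
#include <time.h>

/* NoonTomorrow: hypothetical helper. Returns the epoch time of local
 * noon on the calendar day after Now, or (time_t)-1 on failure. */
time_t NoonTomorrow( time_t Now )
{
    struct tm *Ptr = localtime( &Now );
    struct tm  t;

    if ( Ptr == NULL )
        return (time_t)-1;
    t = *Ptr;
    t.tm_mday += 1;     /* may become 32 or more; mktime() fixes it */
    t.tm_hour  = 12;    /* noon stays clear of savings time changes */
    t.tm_min   = 0;
    t.tm_sec   = 0;
    t.tm_isdst = -1;    /* let the library decide on savings time */
    return mktime( &t );
}
```

Because mktime() carries an out-of-range tm_mday into the month and year, the same code steps from 1999-12-31 to 2000-01-01 without any special handling.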
Programs involving only simple makefiles, with simple targets, dependency rules, and make steps, are most unlikely to have this problem.
Exception: the underlying SCCS, RCS, or other source control programs, must be Y2K compliant for make to work. If you have any doubt at all, check any critical programs to be sure they are compliant.
If these support programs have Y2K bugs in them it may not be possible to build or distribute the application after 2000-January-01 until these bugs are repaired. The application itself may not have any Y2K bugs in it, working and testing fine after 2000-January-01. Thus people testing only applications built in the 199x time period may be missing something critical to successful Year 2000 operations.
On large or complex applications whose build tools use dates, part of the Y2K test should be advancing the local clock to various dates in year 2000 and attempting to rebuild the application in the test years and passing the program through any automatic cataloging or distribution processes. Without this test you may not be able to repair or distribute any Y2K bugs overlooked in the program until after you chase down the make bugs.
The following makefile support program has two such Y2K bugs in it. Study the program before reading the description of the problem to see if you can find these bugs.
BuildNumber.c - return build number for program
A build number is used to uniquely identify a particular build. This function encodes the build number as a magic number (50) followed by a program generated date and hour within that date.

#include <stdio.h>
#include <time.h>

char * BuildNumber(void)
{
    struct tm LocalTime;
    static char Number[10+1];
    time_t Time;

    Time = time((time_t*)NULL);
    LocalTime = *localtime( &Time );
    sprintf( Number, "50%02d%02d%02d%02d",
        LocalTime.tm_year, LocalTime.tm_mon+1,
        LocalTime.tm_mday, LocalTime.tm_hour );
    return Number;
}

FAILURE: This program will produce 50100010100 for 2000-01-01. Automated archive or distribution programs sorting build numbers by character values will suddenly think this is older than programs with build numbers of 5099123123 made moments before and happily redistribute the last program built in 1999 rather than the last program built in year 2000.
FAILURE: Number[11] is overflowed by one character as the year 2000 result is now 11 digits plus the terminating '\0' (null byte), 12 characters total length. This can result in segmentation faults, truncated strings, or other undefined behavior that completely prevents building new programs.
FIX: Change the sprintf() call to put out a three-digit year that still keeps a 10-digit number.

sprintf( Number, "50%02d%02d%02d%02d",    original bad line
sprintf( Number, "5%03d%02d%02d%02d",     repaired line
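The repaired format can be combined with a length check so that any later growth of the year is reported instead of overflowing the buffer. The following is a sketch under that assumption; FormatBuildNumber() is a hypothetical, testable variant of BuildNumber() that takes the broken-down time as a parameter.

```c
#include <stdio.h>
#include <time.h>

/* FormatBuildNumber: hypothetical variant of BuildNumber().
 * Writes "5", a three-digit year-1900 value, then mmddhh into Buf.
 * Returns 0 on success, -1 if the result does not fit in Size bytes. */
int FormatBuildNumber( const struct tm *Tm, char *Buf, size_t Size )
{
    int Len = snprintf( Buf, Size, "5%03d%02d%02d%02d",
                        Tm->tm_year, Tm->tm_mon + 1,
                        Tm->tm_mday, Tm->tm_hour );

    return ( Len < 0 || (size_t)Len >= Size ) ? -1 : 0;
}
```

With an 11-byte buffer, the last hour of 1999 yields 5099123123 and the first hour of 2000 yields 5100010100, so a character sort keeps the builds in order.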
Many *roff programs simply use 19\n(yr to print the current year. This must be changed to match the good code described in section 10.2.4, Troff/Nroff Macros as it tends to format as 19100 or 190 on most versions of *roff in the year 2000. If \n(yr is used without a leading 19 it tends to format as 100 or 0 (single digit!).
Y2K bugs in the original *roff macros are legion.
The difficulty is that these dates are real dates coming soon to computers near us. Programs using these dates are going to do some very strange things when they encounter these dates for real. In practice such programs must use values that are not valid dates, such as a reserved alphabetic name, 00/00/00, or 99/99/99 (non-zero values suggested to avoid accidental blanks from triggering the condition).
Once the program is repaired to use some other magic value, very thorough testing is needed using the former magic dates, including transactions that start and end on these dates, to be sure all logic that tested for these dates has been consistently repaired.
The Global Positioning System allows positions to be calculated by having satellites broadcast very accurate times of day to the GPS receivers. Once a receiver has three or more of these times it performs a bunch of fierce mathematical calculations on the subtle differences in the received times to triangulate the receiver's location. Four times are required to calculate elevation. Because the time is constantly broadcast, and is so accurate, more and more systems that log transaction times, or otherwise require good times, are using special "time only" GPS receivers that computers can read time from.
Computer systems relying on GPS for time of day must allow for various idiosyncrasies of the GPS time system. In short, the time used by the Global Positioning System is slowly drifting from UTC time. During year 2000 GPS system time was 13 seconds ahead of UTC time. This difference will change with each leap second adjustment applied to UTC. Any computers obtaining time of day from GPS receivers must provide the appropriate corrections to obtain a proper UTC time. The GPS receiver must also properly handle GPS End Of Week Rollover conditions where the internal cycle used by GPS resets to zero.
For those who want it, a longer and more technical description follows.
GPS receivers use a 1,024 week (7,168 day, slightly over 19.5 year) cycle where the first cycle started on 1980-January-06 00:00:00 UTC. GPS System Time started when TAI time was 19 seconds ahead of UTC time. International Atomic Time (TAI) started in 1958 and keeps pure atomic time without any "perceivable step adjustments" (e.g., leap second corrections). To simplify internal GPS operations GPS time also omits leap second corrections. Therefore GPS system time will slowly drift from UTC time but always remain 19 seconds behind TAI time. During the years 1999 and 2000 TAI time was 32 seconds ahead of UTC time, which put GPS system time 13 seconds ahead of UTC time. To calculate UTC time from GPS System Time the leap second correction in force at the moment must be applied.
On 1999-August-21 at 23:59:47 UTC (1999-August-22 00:00:00 GPS system time) the original GPS cycle reached its end and returned to week zero. Non-compliant receivers produced incorrect position reports as they reverted to the original start date of 1980-January-06. This condition is known as End Of Week (or EOW) or End Of Week Rollover in the GPS community. This problem tended to be limited to older receivers manufactured before 1995. Most modern receivers successfully coped with this roll-over, though some needed a software update to do so. Receivers impacted by this problem that did not accept firmware upgrades had to be discarded.
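The corrections described above can be sketched as a single calculation. The helper below is hypothetical: it assumes the caller already knows how many 1,024 week cycles have elapsed (the broadcast week number alone cannot say) and knows the leap second difference between GPS and UTC in force at that moment. The result counts seconds since 1970-January-01 UTC in the usual UNIX style, which itself ignores leap seconds.

```c
#define GPS_EPOCH_UNIX   315964800LL    /* 1980-01-06 00:00:00 UTC */
#define SECS_PER_WEEK    604800LL       /* 7 * 24 * 60 * 60 */
#define WEEKS_PER_CYCLE  1024LL         /* weeks before the rollover */

/* GpsToUnix: hypothetical helper.
 *   Cycle      - completed 1,024 week cycles (0 for 1980 to 1999)
 *   Week       - broadcast week number, 0 through 1023
 *   SecOfWeek  - seconds into that week, GPS system time
 *   LeapOffset - seconds GPS is ahead of UTC (13 during year 2000) */
long long GpsToUnix( int Cycle, int Week, long SecOfWeek, int LeapOffset )
{
    long long Weeks = (long long)Cycle * WEEKS_PER_CYCLE + Week;

    return GPS_EPOCH_UNIX + Weeks * SECS_PER_WEEK + SecOfWeek - LeapOffset;
}
```

A receiver that drops the Cycle term treats week zero of the second cycle as week zero of the first, which is the rollover failure described above.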
The next GPS roll over occurs on the weekend of 2019-April-07 Sun. Visit http://www.navcen.uscg.gov/gps/geninfo/y2k/ for more information from the United States Coast Guard web site on GPS.
Windows 95, 98, and NT 4,? will delay a week in the United States (at least) for returning to daylight-saving time on years where 01-April falls on a Sunday. Instead of returning to DST on April-01 the MSVCRT.DLL may return to DST on 2001-April-08. The problem first occurred in the year 2001. Once the DLL is replaced with a corrected version some applications may need to be recompiled. The problem appears to be fixed in Windows 2000.
Article Q214661, "FIX: Daylight Savings Time Bug in C Run-Time Library", in Microsoft's KnowledgeBase discusses this problem and provides links to where programmers can get Service Pack 3 or later for Visual Studio 6.0. A fix for regular users is available via an older standard Windows Update that everyone should have installed by now.
Shell scripts often favor obtaining dates using `date +%y` rather than `date +%Y`.
Month=`date +%m`; Day=`date +%d`; Year2=`date +%y`;
If this logic happened to execute on different days you could get very strange results. The correct logic is to use the read statement to isolate all fields from a single date execution:
date '+%m %d %Y %H %M %S' | read Month Day Year4 HH MM SS
    -- or --
eval `date "+Month=%m; Day=%d; Year4=%Y; HH=%H; MM=%M; SS=%S"`
LastYear=`expr \( $Year2 - 1 \)`;
if [ $LastYear -lt 0 ]; then LastYear=99; fi
LogFile="log/MyLog.$LastYear$Month$Day";    #2001-2010 bug
NextYear=`expr \( $Year2 + 1 \)`;
LogFile="log/MyLog.$NextYear$Month$Day";    #2001+2009 bug
Note that this is not a true year-1900 value, but simply a wrong value.
Also see http://www.merlyn.demon.co.uk/js-dates.htm#SDB.
JavaScript programs that do not convert years of 0 to 99 to 1900 to 1999 will fail at the stroke of a particular midnight. Code to cope with this bug follows:
YearThis = now.getYear();   // get the year
if ( YearThis < 500 )       // allow for 1900-2399 to return 0-499.
    YearThis += 1900;
In fear that some (yet to be discovered) implementations of JavaScript will always return the years as year-1900, JavaScript programs written by this author break on the year 500 rather than the year 100. It would be easy for a JavaScript implemented in C or C++ to do this: simply returning the native year in "tm" structures will do it very nicely. This mistake is frighteningly easy to make.
An interesting result of this particular bug is that most year 2000 count down programs in JavaScript are not year 2000 compliant. Many always expect the year to be just like it is in C: values of year-1900 with the year 2000 being 100 rather than the 2000 it will be. During the early days of the year 2000, if you see Y2K pages claiming there are "1900 years left until the year 2000", you will know what happened.
Java itself appears to have implemented getYear() correctly, consistently returning four-digit years.
Some systems do not allow for leap seconds, including values in time_t, while some systems do. Consult your local documentation. If your environment does not support leap seconds, allowing for future releases of the OS to support leap seconds is wise.
Some systems improperly calculate values of time_t. Thus values of time_t may not be portable across all systems or compilers. This is generally only a problem if the local operating system does not track time in time_t units, making it necessary for the language library functions to do the translation. Differences of up to thousands of seconds have been observed by this author.
Naturally operating systems can have most of the other time bugs in this document.
When testing date logic of programs, especially for testing Year 2000 related repairs, it is common practice to change the CPU's clock to test multiple dates. The best dates to test are those the application has a chance of breaking under. Traditional dates include 1999-12-31, 2000-01-01, 2000-02-29, 2000-03-01, and a date in the year 2010.
Instead of changing the CPU's clock, sometimes you can run special applications that allow you to leave the CPU's actual time alone, while making the application under test believe it is running in a different time rather than the current true time. The best of these applications allow testers to control the speed of the clock, faster or slower than real time. Another sign of a good application is one that adjusts the timestamp of files, etc., so the system sees the current true time but the application sees the advanced time. This reduces the need to artificially age file and other timestamps.
Changing the environment variable TZ, on systems supporting TZ, has proven itself insufficient for serious testing. Too much code is happy to ignore TZ. In C, TZ does not impact values in time_t variables, just local times.
Changing the date applications run under is not without its own hazards. Before advancing the application's time for testing, be sure nothing is present to rise up and bite you. Some of these issues include, but are not limited to:
Back up everything in sight to be sure you can recover from any data loss that may strike as a result of the tests.
The ideal test bed is a sacrificial system that is a clone of your production system, with full backups beforehand and a total reinstall later. Best of all, though hardest to set up, is a system isolated from your production networks. Sadly this is not practical for many people.
It may be necessary to artificially age dates in databases to match the time you are running the test in. Such data aging may best be done after the date is changed but before the tests are run.
If testing multiple future dates it is best to cycle through the test dates in ascending order.
After all of your test runs are complete, reboot the system after returning to the current time if you changed the CPU's actual time. Most operating systems take critical actions at regular intervals. Once the clock is set into the future the system may stop performing critical tasks until the clock advances to the older time. Even if the system appears to work fine there may be problems brewing out of sight.
Computer operators often see such freezes when they correct the CPU's clock by 10 seconds or so if the clock becomes adrift of true time. If you just backed up the CPU time a year or more it is a long wait without the reboot. Flushing the disk cache, updating graphic window displays, batch jobs running at scheduled intervals (e.g., UNIX cron jobs): the list of potential problem areas is very long.
After you run with the local time set to the future you may wind up with lots of files date-touched, with these dates. While simply rebooting the system with the correct date and time allows the operating system to recover (usually!), applications depending on file time stamps or dates written within files and databases may become confused by the future dates. It is often easier to just scrub the system and reinstall once all tests are complete than to track down the future dates and revert them.
Always use the native data type for the language to hold date and time values. For C, time_t holds numeric values suitable for comparing and calculating many times. (NOTE: some systems are starting to use 64-bit time_t values rather than the currently popular 32-bit time_t values.)
Always use any built-in translation functions to map between different types of time data.
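As a sketch of both rules in C, the hypothetical helper below converts two local broken-down times with mktime() and subtracts them with difftime(), so the program never depends on how time_t is represented. Error handling for a failed mktime() is omitted to keep the example short.

```c
#include <time.h>

/* SecondsBetween: hypothetical helper. Both arguments are local
 * broken-down times; they are passed by value because mktime()
 * may modify the structure it is given. */
double SecondsBetween( struct tm Start, struct tm End )
{
    time_t t1, t2;

    Start.tm_isdst = -1;    /* let the library decide on savings time */
    End.tm_isdst   = -1;
    t1 = mktime( &Start );  /* built-in translation to time_t */
    t2 = mktime( &End );
    return difftime( t2, t1 );  /* portable subtraction of time_t */
}
```

The same code works unchanged whether time_t is a 32-bit or a 64-bit value.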
The *roff family of programs traditionally maintain two independent symbol tables... one for "numeric registers" and another for "string registers" and macro definitions. The same symbol name may be used for a string or macro without conflicting with an existing numeric register. The date solutions proposed herein define new string register names that match, or are close to, the original numeric register names used for dates.
It is assumed that the serious reader knows the basic rules of *roff. *roff programs have very strange rules to define and reference symbols that are not discussed herein. Two rules to be particularly aware of here are that the names are case sensitive and traditionally limited to two (2) characters in length.
As the native yr numeric register may contain year%100 or year-1900 style years, the following code sets the (new) numeric register Yr to a full four-digit year using the current year found in the built-in numeric register yr.

.nr Yr \n(yr\" set to "short" year
.if \n(Yr<69 .nr Yr \n(Yr+100\" apply official UNIX pivot year
.if \n(Yr<500 .nr Yr \n(Yr+1900\" ensure full four-digits
See section 10.8.2, Fixed Pivot Years, for a description of this logic.
Two-digit years should not continue to be stored in numeric registers as that will format years as a single digit number for the years 2000 through 2009. (Any testing of *roff pages should include formatting both years 200x and 20xx to check the switch between one and two digits.)
The following logic converts the numeric date registers into appropriate two-digit strings of the same name:

.nr Yr \n(yr%100\" ensure two-digit numeric register
.ds Yr \n(Yr\" convert to string
.if \n(Yr<10 .ds Yr 0\n(Yr\" 0-9 becomes 00-09
.ds mo \n(mo
.if \n(mo<10 .ds mo 0\n(mo
.ds dy \n(dy
.if \n(dy<10 .ds dy 0\n(dy
Use these string values to obtain two-digit date fields.
To obtain one-digit month and days, with leading spaces, use:
.if \n(mo<10 .ds mo \0\n(mo
.if \n(dy<10 .ds dy \0\n(dy
Omit these "if" statements if you want one-digit month and days for values of 1 through 9. The current trend appears to always format years using two-digits.
To obtain dates in ISO 8601 format, combine the previous Yr logic of four-digit years and the mo and dy logic of two-digit dates with the following statement:
.ds dt \*(Yr-\*(mo-\*(dy\"ISO 8601 date
Testing the date would use something like the following:
.br
Today's Date is \*(dt
.br
ftime() returns current time, with a potential resolution of up to one-thousandth of a second (millisecond), along with current time zone and daylight saving time information.
The proper way to determine if a year is or is not a leap year requires three different tests:
A correct C expression for determining if a year is a leap year or not follows:
LeapSw = !(Year % 4) && ( (Year % 100) || !(Year % 400) );
C Suggestion: Many UNIX systems provide a header file named tzfile.h that provides many definitions useful to time processing. If your system does not provide a tzfile.h you can download a compressed file from:
http://sunsite.doc.ic.ac.uk/public/pub/public/unix/4.3bsd-reno/include/
or create your own with just the following required definitions, changing any values as appropriate to your system:
/* tzfile.h -- local subset of regular tzfile.h */

/* determine if leap year: (valid for current Gregorian Calendar) */
#define isleap(Year) ( \
    !((Year) % 4) && (((Year) % 100) || !((Year) % 400)) \
)
#define EPOCH_YEAR 1970

/* end: tzfile.h */
Ordinal Dates, popularly but incorrectly called Julian Dates (or Julian Calendar), provide a year and the number of days into that year. 2000-January-01 is 2000-001. 1999-December-31 is 1999-365.
To translate between standard Gregorian Dates and Ordinal Dates the following code fragment can provide a helpful model:
short OrdinalDays[] = {
    /* days (0-364 or 0-365) BEFORE each month */
    0,31,59,90,120,151,181,212,243,273,304,334,365,
    0,31,60,91,121,152,182,213,244,274,305,335,366
};
short LeapSw;   /* 0 if not a leap year, 13 if leap year */

LeapSw = ( Year%4 || (!(Year%100) && Year%400)) ? 0 : 13;
To translate a Gregorian Month (1-12) and Day (1-31), into an Ordinal day number (0-364 or 0-365) for a given Year:
Ordinal = OrdinalDays[Month-1+LeapSw] + Day-1;
Note how LeapSw directs the array references to the first or last half of OrdinalDays where the first half is for normal years and the second half is for leap years.
To determine the number of days in the current Year:
DaysInYear = OrdinalDays[12+LeapSw];
Note that the OrdinalDays table has 13 month values in it for each type of year just to allow this subscript to work.
To determine the number of days in the current Gregorian Month:
DaysInMonth = OrdinalDays[Month+LeapSw] - OrdinalDays[Month-1+LeapSw];
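The same table can also drive the reverse translation, from an Ordinal day number back to a Gregorian Month and Day. The sketch below repeats the table so it stands alone; the function names ToOrdinal(), FromOrdinal(), and LeapOffset() are hypothetical, and the Ordinal value is assumed to be valid for the given Year.

```c
/* days (0-364 or 0-365) BEFORE each month; second row is leap years */
static const short OrdinalDays[] = {
    0,31,59,90,120,151,181,212,243,273,304,334,365,
    0,31,60,91,121,152,182,213,244,274,305,335,366
};

/* LeapOffset: 0 if not a leap year, 13 if leap year */
static int LeapOffset( int Year )
{
    return ( Year%4 || (!(Year%100) && Year%400) ) ? 0 : 13;
}

/* ToOrdinal: Month (1-12) and Day (1-31) to Ordinal day (0-365) */
int ToOrdinal( int Year, int Month, int Day )
{
    return OrdinalDays[Month-1+LeapOffset( Year )] + Day-1;
}

/* FromOrdinal: hypothetical inverse of ToOrdinal(). Steps forward
 * while the following month starts on or before the Ordinal day. */
void FromOrdinal( int Year, int Ordinal, int *Month, int *Day )
{
    int LeapSw = LeapOffset( Year );
    int m = 1;

    while ( m < 12 && Ordinal >= OrdinalDays[m+LeapSw] )
        m++;
    *Month = m;
    *Day   = Ordinal - OrdinalDays[m-1+LeapSw] + 1;
}
```

Note that these Ordinal values start at zero, so one must be added to obtain the printed form (2000-001 for 2000-January-01).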
Chronological Julian Days follow Julian Days but with a half-day offset to follow the current Gregorian convention of starting a new day at midnight.
Modified Julian Days start counting at midnight 1858-November-17 and include a 0.5 day offset from traditional Julian Days to align them with the current convention of starting a day at midnight.
Designing logic for date intensive programs may be much easier if the dates are stored and manipulated using some form of Chronological Julian Days. Problems associated with Y2K, 2038, leap years, and many others, all go away with Julian Days.
It is surprisingly easy to convert between the two if you keep to modern dates ignoring historic calendar corrections and other trivia. The following C code provides such a simple-minded example.
/* JULIAN_OFFSET: Julian day for either year 0 (1BC)
 * if normal Julian Days are wanted, or offset needed
 * to make 1858-11-17 day zero for Modified Julian Days.
 * Pick the appropriate definition for your use. */
#define JULIAN_OFFSET 1721059                   /* use full Julian Days */
/* #define JULIAN_OFFSET (1721059-2400000) */   /* use MJD */

typedef long jday;

jday Tm2Julian( struct tm *Tm )     /* incoming year */
/* (valid year range 1752 through far future) */
{
    jday JulDay;
    int Year;                       /* full four-digit year */
    int Year1;                      /* Year, less 1 */

    Year = Tm->tm_year + 1900;      /* calculate 4-digit yr */
    Year1 = Year - 1;               /* year, less 1 */
    JulDay = (jday)Year * 365;      /* approximate day cnt */
    JulDay += (Year+3) / 4;         /* add in leap years */
    JulDay -= Year1 / 100;          /* fix for 100 years */
    JulDay += Year1 / 400;          /* fix for 400 years */
    JulDay += Tm->tm_yday;          /* add in ordinal day (0-365) */
    JulDay += JULIAN_OFFSET;        /* final adjustment */
    return JulDay;                  /* return appropriate answer */
}
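To show why day numbers simplify date arithmetic, the sketch below applies the same formula to a plain year, month, and day, using the Modified Julian Day offset. The names DateToMjd(), IsLeap(), and DaysBefore are hypothetical, and the code assumes Gregorian dates within the range stated above.

```c
/* days before each month in a year that is not a leap year */
static const short DaysBefore[12] = {
    0,31,59,90,120,151,181,212,243,273,304,334
};

static int IsLeap( int Year )
{
    return !(Year % 4) && ( (Year % 100) || !(Year % 400) );
}

/* DateToMjd: hypothetical helper returning the Modified Julian Day
 * for a Gregorian date (Month 1-12, Day 1-31). */
long DateToMjd( int Year, int Month, int Day )
{
    long Year1 = Year - 1;                  /* year, less 1 */
    long Days  = (long)Year * 365;          /* approximate day cnt */

    Days += ( Year + 3 ) / 4;               /* add in leap years */
    Days -= Year1 / 100;                    /* fix for 100 years */
    Days += Year1 / 400;                    /* fix for 400 years */
    Days += DaysBefore[Month-1] + Day - 1;  /* ordinal day, from 0 */
    if ( Month > 2 && IsLeap( Year ) )
        Days += 1;                          /* past February-29 */
    return Days + ( 1721059L - 2400000L );  /* MJD offset */
}
```

Subtracting two results gives the number of days between the dates, so DateToMjd(2000,3,1) - DateToMjd(2000,2,28) is 2 because the leap day is already accounted for.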
A perl subroutine library is available at http://www.exit109.com/~ghealton/y2k/julians.pl that converts between Julian Days, Ordinal Dates, and standard Gregorian calendar dates.
Picking a non-traditional epoch to start counting days is a popular way to obtain smaller day numbers for current dates.
A simplified version of Julian Days, that this author calls "integer days", allows up to 176 years to be packed into a 16-bit unsigned value. Encoding and decoding the actual year, month, and day is very direct and simple. This speed is at the expense of not being able to use these dates in calendar calculations. Subtracting two dates does not tell you the number of days between the two dates, though it does tell you the general relation between two dates.
The algorithm stores something close to the number of days that have occurred between the encoded date and a previously selected epoch (no dates may be before the epoch, even if IntDate is signed). The encoding is done under the assumption that each month has 31 days in it, resulting in a "year" of 372 days. To fit 365 days in a year this technique wastes 7 days a year (6 days during leap years).
#include <limits.h>             /* get #define CHAR_BIT 8 */

typedef unsigned short IntDay;  /* our data type */

#ifndef INT_DAY_EPOCH
#define INT_DAY_EPOCH 1970      /* base year */
    /* suggest close to 1970 if IntDay is unsigned short.
       suggest 0 if IntDay is unsigned > short
       suggest (-4713+1) if IntDay is signed long */
#endif

#define INT_DAY_DAYS (31*12)    /* maximum days in year */
#define INT_DAY_YMAX            /* maximum years */ \
    (((1U<<(sizeof(IntDay)*CHAR_BIT-1))/INT_DAY_DAYS)*2)
    /* determine maximum value IntDay may hold,
     * using power of two, via shift. Use CHAR_BIT-1
     * to only use one-half of the total value
     * to ensure we do NOT overflow and get a value
     * of 0 by shifting the bit all the way out.
     * Make up for this by the final *2, which
     * doubles the previously halved value. */

/* Sample code to step through each year follows: */
/* for ( n = 0; n < INT_DAY_YMAX; ...) {  */
/*     int year = n + INT_DAY_EPOCH;      */

/* extract Gregorian year from an IntDay value */
#define IntDayYear(Day) \
    ( (Day) / ( 31 * 12 ) + INT_DAY_EPOCH )

/* IntDayEncode() - Gregorian Date To IntDay */
IntDay IntDayEncode(    /* returns compacted date */
    int Year,           /* year to encode (e.g., 1998) */
    int Month,          /* month to encode (1 thru 12) */
    int Day )           /* day to encode (1 thru 31) */
{
    return (IntDay)((
        (IntDay)( Year - INT_DAY_EPOCH ) * (IntDay)12
        + Month-1 ) * 31 - 1 + Day );
}

/* IntDayDecode() - IntDay to Gregorian Date */
void IntDayDecode(      /* has no value on return */
    IntDay Encoded,     /* compacted date to expand */
    int *Year,          /* not NULL: ptr to store year at */
    int *Month,         /* not NULL: ptr to store month at */
    int *Day )          /* not NULL: ptr to store day at */
{
    if ( Day ) *Day = Encoded % 31 + 1;
    Encoded /= 31;
    if ( Month ) *Month = Encoded % 12 + 1;
    Encoded /= 12;
    if ( Year ) *Year = Encoded + INT_DAY_EPOCH;
    return;
}

Other strange ways of packing dates also exist, such as the one known as GYMD as found at http://www.gtbaddow4.freeserve.co.uk/.
A pivot year, also known as date windowing, takes a two-digit year and expands it to determine which century the year is in. Typically the year is converted to either a full four-digit year or into the year-1900 format, as appropriate to the application at hand.
The following types of pivot years exist:
Advantages: easy to code and debug. Static dates read from databases are always interpreted consistently.
Disadvantages: limited life span of logic. The life span can be extended by reading the pivot year at run time from a fixed location, so that it can be changed over time or when the program is moved to an environment with a different pivot year.
Advantages: longer life span for logic.
Disadvantages: not suitable for all applications. Harder to code and debug.
Advantages: longest life span for logic.
Disadvantages: not suitable for all applications. Hardest to code and debug. Requires more testing.
The following logic accepts as input either pure two-digit years, year-1900 values, or full four-digit years. The result is four-digit years.
    #define YEAR_PIVOT 68     /* official UNIX pivot */
    #define YEAR_BREAK 500    /* value < AnyYearWeUse && value > AnyYearWeUse-1900 */

    int FixYear( int YearWork )
    {
        if ( YearWork <= YEAR_PIVOT )   /* 21st century? */
            YearWork += 100;            /* yes: pivot year */
        if ( YearWork < YEAR_BREAK )    /* not yet a four-digit year? */
            YearWork += 1900;           /* yes: make four-digit */
        return( YearWork );             /* return 4-digit year */
    }
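The life span of a fixed pivot can be stretched by reading the pivot year at run time, as noted in the disadvantages listed earlier. The following is a minimal sketch of that idea, not a definitive implementation. The environment variable name DATE_YEAR_PIVOT and the functions GetYearPivot() and FixYearPivot() are hypothetical names invented for this illustration; a configuration file or database field would serve equally well as the "fixed location".

```c
#include <stdlib.h>   /* getenv(), strtol() */

#define YEAR_PIVOT_DEFAULT 68   /* fallback: same pivot FixYear() uses */
#define YEAR_BREAK 500          /* values below this are not four-digit years */

/* Hypothetical helper: read the pivot from the environment variable
 * DATE_YEAR_PIVOT (an assumed name). Falls back to the default when
 * the variable is unset, empty, not a number, or outside 0..99. */
int GetYearPivot(void)
{
    const char *text = getenv("DATE_YEAR_PIVOT");
    char *end;
    long value;

    if (text == NULL || *text == '\0')
        return YEAR_PIVOT_DEFAULT;
    value = strtol(text, &end, 10);
    if (*end != '\0' || value < 0 || value > 99)
        return YEAR_PIVOT_DEFAULT;
    return (int)value;
}

/* Same logic as FixYear(), but the pivot is a parameter. */
int FixYearPivot(int YearWork, int Pivot)
{
    if (YearWork <= Pivot)        /* at or below pivot: 21st century */
        YearWork += 100;
    if (YearWork < YEAR_BREAK)    /* not yet a four-digit year */
        YearWork += 1900;
    return YearWork;
}
```

A caller would typically fetch the pivot once at start-up and then use FixYearPivot( Year, Pivot ) wherever FixYear( Year ) was used before.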
Because time functions, no matter what the language, are generally system calls, they often carry more overhead than calls to simple conventional functions. It is often best to obtain the time once at the start of a transaction (e.g., reading a record, opening a connection, starting some request) and to use that time throughout the transaction, especially if the transactions are short but frequent.
Remembering a single start time ensures all timestamps in different messages are synchronized with each other. Alternate time formats can be saved by calling functions like C's localtime() function, to convert the system time to other formats at the same time the system time is fetched.
Longer transactions, time stamping of log files, and other places where the exact time is critical can call for more frequent use of time functions. Fetching the system time within heavily executed inner loops should be avoided unless you truly need the time that frequently.
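The "fetch once, reuse everywhere" advice can be sketched as follows. This is only an illustration under stated assumptions: the TransTime structure and the TransTimeStart() function are hypothetical names, and the ISO 8601 style stamp format is just one possible choice of alternate format.

```c
#include <time.h>

/* Hypothetical per-transaction time record: the raw system time plus
 * pre-converted forms, all derived from a single time() call. */
typedef struct {
    time_t    Raw;        /* system time as returned by time() */
    struct tm Local;      /* broken-down local time */
    char      Stamp[20];  /* "YYYY-MM-DD hh:mm:ss" plus NUL */
} TransTime;

/* Fetch the system time once and convert it to the formats needed.
 * Returns 0 on success, -1 if a conversion failed. */
int TransTimeStart(TransTime *t)
{
    struct tm *tmp;

    t->Raw = time(NULL);
    tmp = localtime(&t->Raw);
    if (tmp == NULL)
        return -1;
    t->Local = *tmp;      /* copy: localtime() may reuse a static buffer */
    if (strftime(t->Stamp, sizeof t->Stamp,
                 "%Y-%m-%d %H:%M:%S", &t->Local) == 0)
        return -1;
    return 0;
}
```

Every message written during the transaction can then print the saved Stamp, so all of its timestamps agree with each other.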
Programs creating files may wish to use something like C's utime() function to set the timestamp of a file to exactly match a time appearing in an important log file to better assure users they have the correct file when looking at problems.
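A minimal sketch of that idea follows, assuming a POSIX system that provides utime() in <utime.h>. The function name StampFile() is hypothetical. Newer POSIX editions mark utime() as obsolescent, so other interfaces may be preferable on current systems.

```c
#include <time.h>
#include <utime.h>   /* POSIX utime(), struct utimbuf */

/* Set both the access and modification times of Path to Stamp,
 * the same time_t value that was written to the log file.
 * Returns 0 on success, -1 on failure (utime() sets errno). */
int StampFile(const char *Path, time_t Stamp)
{
    struct utimbuf times;

    times.actime  = Stamp;   /* access time */
    times.modtime = Stamp;   /* modification time */
    return utime(Path, &times);
}
```

The value passed as Stamp would normally be the time fetched at the start of the transaction, so the file and the log entry carry exactly the same second.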
Section 8.10, Shell Script Bugs, also describes date related bugs found in shell scripts.
If you receive dates that may have incorrect century digits in them due to Y2K bugs in the generating program, the following logic may prove useful:

    Year2 = UntrustedYear % 100;   /* discard century digits */
    Year4 = FixYear( Year2 );      /* rebuild century digits */
This code divides the incoming year by 100. The remainder, a two-digit year, is then adjusted to become an appropriate four-digit year.
When coding logic that needs to wait a specific elapsed time, check to see if your local system supports a feature generally called "interval timers". These can be used, often with great precision, to wait a specific amount of time. Local sleep() functions often use interval timers.
Historically, program delays have been performed by reading the current time, adding the desired delay to it, then waiting for that "wall clock time" to occur. This works fine provided the operator does not, at a critical moment, change the system time to correct for the system clock drifting from true time. Testing date logic can also result in date changes.
See section 9.2, Hazards Of Changing The Time, for additional problems associated with using wall clock times. In many cases you will be stuck with wall clock times, regardless of the problems they cause, but don't use them out of reflex when interval timers are a reasonable alternative.
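One way to measure elapsed time without depending on the wall clock is sketched below. Note that this swaps in the POSIX monotonic clock, clock_gettime() with CLOCK_MONOTONIC, rather than a classic interval timer such as setitimer(); it assumes a system that supports that clock. The names ElapsedSeconds() and WaitElapsed() are hypothetical, and the 10 ms polling interval is an arbitrary choice.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200112L   /* request clock_gettime(), nanosleep() */
#endif
#include <time.h>

/* Seconds elapsed between two clock readings, as a double. */
double ElapsedSeconds(const struct timespec *from,
                      const struct timespec *to)
{
    return (double)(to->tv_sec - from->tv_sec)
         + (double)(to->tv_nsec - from->tv_nsec) / 1e9;
}

/* Wait until at least Seconds have elapsed on the monotonic clock,
 * which is not affected by changes to the wall clock time.
 * Returns 0 on success, -1 if the monotonic clock is unavailable. */
int WaitElapsed(double Seconds)
{
    struct timespec start, now, nap;

    if (clock_gettime(CLOCK_MONOTONIC, &start) != 0)
        return -1;
    nap.tv_sec  = 0;
    nap.tv_nsec = 10 * 1000 * 1000;    /* poll every 10 ms */
    for (;;) {
        if (clock_gettime(CLOCK_MONOTONIC, &now) != 0)
            return -1;
        if (ElapsedSeconds(&start, &now) >= Seconds)
            return 0;
        nanosleep(&nap, NULL);  /* may return early; the loop rechecks */
    }
}
```

Because the deadline is computed from elapsed time rather than from a target wall clock time, an operator correcting the system clock during the wait neither shortens nor lengthens it.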
If you determine the extra effort is warranted to cope with occasional time of day corrections from operators, the following code presents a model you can expand on.
    /* code to wait until the file whose path is in FileName
     * is created.
     * (needs <sys/stat.h>, <time.h>, and <unistd.h>)
     */
    #define DELAY 45        /* delay before timeout */

    struct stat StatBuff;   /* result of stat() */
    int DelayLeft;          /* failsafe time countdown */
    time_t TimeLast;        /* time loop expires */
    time_t DelayLast;       /* last observed time */

    DelayLeft = DELAY;                          /* set failsafe counter */
    DelayLast = (time_t)0;                      /* and set associated time */
    TimeLast = time((time_t *)NULL) + DELAY;    /* expire time */

    while( time((time_t *)NULL) <= TimeLast ) {
        /* keep waiting for the file to appear */
        if ( stat( FileName, &StatBuff ) == 0 ) {
            /* file exists */
            break;                    /* done! */
        }

        /* (the main concept this section is demonstrating
         * is in the following if...)
         */
        if ( DelayLast != time((time_t *)NULL) ) {
            /* in a new second: failsafe test to ensure
             * we never loop too long if system clock
             * moves backwards on us
             */
            if ( DelayLeft-- < 0 )    /* count down failsafe */
                break;                /* expired: stop */
            time( &DelayLast );       /* remember new time */
        }

        sleep(1);                     /* sleep a one second interval */
        /* NOTE: signals, and other events, may result in
         * the sleep returning early. It may also sleep
         * noticeably longer than 1 second on busy systems.
         * In some environments simply counting sleep(1)
         * calls will not work, at least without a
         * lot of additional code.
         */
    }

The sleep, while it may use interval timers, does not make the outer loop immune to time changes. Without the special test that watches for, and counts down, new seconds, the loop would suffer extra delays if the system time went backwards in the middle of the loop. Note that each backwards time change may result in one second less of a wait before timeout.
Advancing the time in a forward direction may make the outer loop "time out" early in the while statement. If the time change is small, such as when the local clock is corrected, the application should not have trouble provided the timeout period is sufficiently large.
http://www.exit109.com/~ghealton/.dates.html (dot dates dot html)
There are a number of Y2K links scattered throughout this document.
Murisier Serge (1999-09) s.murisiercross-systems.com
Comments about Microsoft Epoch dates.

José Carlos Fernández Gutiérrez (2000-02) emejcfg madrid.es.eu.ericsson.se
Reported problem in automatic formatting of date in title.

Valerie Kramer (2000-11) funzoneharborside.com
Correction to "Odd Day".

Jerome Fine (2000-11) jhfineidirect.com
Lots of information about the tropical year and Leap Second drift, including the difficulty in projecting future drift. Hopefully Version 2 of the document will have more of these comments merged into it.

Gordon Speer (2000-02) speeressex1.com
Also caught my "Odd Day" mistake. First person to report Latitude / Longitude error.

Ian Galpin (2002-01) g1smdamsat.org
Editor for the ISO 8601 Standard section of the Open Directory Project (ODP). Many changes throughout the document, especially in areas concerning ISO 8601. Highlights of the corrections include:
- Leap seconds added at same time around the globe.
- Latitude / Longitude error spotted.
- ISO stands for International Organization for Standardization.
- Clarifying "T" versus space in ISO 8601 dates and times.
- 4.2.3: corrections to reduction / truncation table.
- Week Number corrections.
- Roman Numeral corrections.
- GPS updates.
- Whitaker's Almanac reference.

Ed Davies (2002-01) edavies nildram.co.uk
Reported problem in ISO timezone offsets along with several general typos.

Dr. John Stockton (2002-01) (2002-07) jrs merlyn.demon.co.uk
- Suggested inclusion of the mess they made of the early Julian calendar.
- Reworded "shortest year" descriptions at his suggestion.
- EU uses uniform savings time plan.
- Astronomical dates added by request.
- Provided fodder for 20/02/2002 in Odd Day.
- Technical notes, and some corrections, about computer time keeping.
- Additional information about GPS.
- Time test hazards: described DOS timer at $40:$6C above $1800AF problem.
- Julian! Julian!, Who Are Thou Julian? added by request.
- "Some Transport organisation(s) use(s) a day from 03:00 to 27:00" demoted to rumor unless additional information is found.

David R Tribble (2002-02) davidtribble.com
- Descriptions of ways different operating systems track time of day. See http://david.tribble.com/text/c0xtime.htm#prior.art.
- The US military one-letter timezone code also omits "W".
- Assorted technical and general corrections.

Thomas Scheidegger (2002-03) tscheideswissonline.ch
- Notes on Windows NT time.
- Informed me of Microsoft .NET date formats. http://msdn.microsoft.com/library/en-us/cpref/html/frlrfSystemDateTimeClassTopic.asp
- Informed me of the most wonderful site at http://metric1.org/.

J. S. Connell (2002-09) ankhcanuck.gen.nz
- Additional information on Microsoft 100ms ticks.
- Information on 1904-01-02 Macintosh Excel epoch.
- Assorted corrections.

Troy Goodson (2002-12) Troy_Goodsoniname.com
Use proper "Coordinated Universal Time", not "Universal Coordinated Time", as some major sites have.

Dan Kohn (2003-02) dandankohn.com
RFC-3339 brought to my attention.

Daniel Biddle (2003-07) deltabosian.net
- Missing comma in RFC-822 date and time format.
- Some technical details about how sendmail writes dates not following RFC-822.
- Spotted use of "jjj" for some ordinal dates; now using "ddd".
- tzinfo supports both "posix" zones, which stay POSIX compliant (and ignore leap seconds), and "right" zones, which observe leap seconds.
NOTE: the strange use of the image in the previous E-mail addresses is to hide the addresses of these kind people from being harvested by evil E-mail robots that bomb any E-mail address they find with spam.