Theory and pragmatics of the tz code and data

Scope of the tz database

The tz database attempts to record the history and predicted future of all computer-based clocks that track civil time. To represent this data, the world is partitioned into regions whose clocks all agree about timestamps that occur after the somewhat-arbitrary cutoff point of the POSIX Epoch (1970-01-01 00:00:00 UTC). For each such region, the database records all known clock transitions, and labels the region with a notable location. Although 1970 is a somewhat-arbitrary cutoff, there are significant challenges to moving the cutoff earlier even by a decade or two, due to the wide variety of local practices before computer timekeeping became prevalent.
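The partitioning rule can be illustrated with a toy example. The sketch below is not part of the tz code: the location names and transition data are made up, the helper names are hypothetical, and only UT offsets are compared (real data also carries abbreviations and daylight saving flags). It groups locations whose clocks agree from the POSIX Epoch onward, ignoring any earlier differences.

```python
from collections import defaultdict

EPOCH = 0  # POSIX Epoch (1970-01-01 00:00:00 UTC) as a Unix timestamp


def post_epoch_key(history):
    """Summarize a clock history by what matters from the Epoch onward.

    `history` is a time-sorted list of (unix_time, utc_offset_seconds)
    pairs; each pair says which offset takes effect at that instant.
    """
    offset_at_epoch = None
    later = []
    for when, offset in history:
        if when <= EPOCH:
            offset_at_epoch = offset  # last entry at or before the Epoch wins
        else:
            later.append((when, offset))
    return (offset_at_epoch, tuple(later))


def partition_by_post_epoch(histories):
    """Group location names whose post-Epoch clock behavior is identical."""
    regions = defaultdict(list)
    for name, history in histories.items():
        regions[post_epoch_key(history)].append(name)
    return sorted(sorted(names) for names in regions.values())


# Hypothetical data: Town_A and Town_B differ only before 1970.
HISTORIES = {
    "Town_A": [(-2_000_000_000, 3600), (100_000_000, 7200)],
    "Town_B": [(-2_000_000_000, 3000), (-500_000_000, 3600), (100_000_000, 7200)],
    "Town_C": [(-2_000_000_000, 3600), (200_000_000, 7200)],
}

print(partition_by_post_epoch(HISTORIES))
```

In this toy data, Town_A and Town_B land in the same region even though their pre-1970 histories differ, while Town_C gets its own region because its post-1970 transition occurs at a different time.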

Clock transitions before 1970 are recorded for each such location, because most systems support timestamps before 1970 and could misbehave if data entries were omitted for pre-1970 transitions. However, the database is not designed for and does not suffice for applications requiring accurate handling of all past times everywhere, as it would take far too much effort and guesswork to record all details of pre-1970 civil timekeeping. Although some information outside the scope of the database is collected in the file 'backzone' that is distributed along with the database proper, this file is less reliable and does not necessarily follow database guidelines.

As described below, reference source code for using the tz database is also available. The tz code is upwards compatible with POSIX, an international standard for UNIX-like systems. As of this writing, the current edition of POSIX is: The Open Group Base Specifications Issue 7, IEEE Std 1003.1-2008, 2016 Edition.

Names of time zone rules

Each of the database's time zone rules has a unique name. Inexperienced users are not expected to select these names unaided. Distributors should provide documentation and/or a simple selection interface that explains the names; for one example, see the 'tzselect' program in the tz code. The Unicode Common Locale Data Repository contains data that may be useful for other selection interfaces.

The time zone rule naming conventions attempt to strike a balance among the following goals:

Names normally have the form AREA/LOCATION, where AREA is the name of a continent or ocean, and LOCATION is the name of a specific location within that region. North and South America share the same area, 'America'. Typical names are 'Africa/Cairo', 'America/New_York', and 'Pacific/Honolulu'.
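As a minimal sketch of the AREA/LOCATION form, the following hypothetical helper (not part of the tz code) splits a name at its first slash. It assumes that anything after the first slash is the location, and it returns no area for names that contain no slash at all.

```python
def split_tz_name(name):
    """Split a name of the form AREA/LOCATION at the first '/'.

    Returns (area, location). Names without a '/' return (None, name).
    """
    area, sep, location = name.partition("/")
    if not sep:
        return (None, name)
    return (area, location)


for name in ("Africa/Cairo", "America/New_York", "Pacific/Honolulu"):
    print(split_tz_name(name))
```

A selection interface might use the area component to group names into menus, replacing underscores in the location with spaces for display.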

Here are the general rules used for choosing location names, in decreasing order of importance:

The file 'zone1970.tab' lists geographical locations used to name time zone rules. It is intended to be an exhaustive list of names for geographic regions as described above; this is a subset of the names in the data. Although a 'zone1970.tab' location's longitude corresponds to its LMT offset with one hour for every 15° east longitude, this relationship is not exact.
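The rule of one hour per 15° of east longitude works out to 240 seconds per degree. The sketch below applies that rule with a hypothetical helper; the longitudes are illustrative inputs, not values taken from 'zone1970.tab', and the result is only an approximation of a recorded LMT offset.

```python
SECONDS_PER_DEGREE = 3600 / 15  # one hour for every 15 degrees of longitude


def approx_lmt_offset_seconds(longitude_deg_east):
    """Approximate LMT offset from UT, in seconds.

    Positive longitudes are east of Greenwich; west longitudes are negative
    and give negative offsets.
    """
    return longitude_deg_east * SECONDS_PER_DEGREE


print(approx_lmt_offset_seconds(15.0))   # 3600.0, i.e. one hour ahead of UT
print(approx_lmt_offset_seconds(-74.0))  # -17760.0, i.e. 4 h 56 min behind UT
```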

Older versions of this package used a different naming scheme, and these older names are still supported. See the file 'backward' for most of these older names (e.g., 'US/Eastern' instead of 'America/New_York'). The other old-fashioned names still supported are 'WET', 'CET', 'MET', and 'EET' (see the file 'europe').

Older versions of this package defined legacy names that are incompatible with the first rule of location names, but which are still supported. These legacy names are mostly defined in the file 'etcetera'. Also, the file 'backward' defines the legacy names 'GMT0', 'GMT-0' and 'GMT+0', and the file 'northamerica' defines the legacy names 'EST5EDT', 'CST6CDT', 'MST7MDT', and 'PST8PDT'.

Excluding 'backward' should not affect the other data. If 'backward' is excluded, excluding 'etcetera' should not affect the remaining data.

Time zone abbreviations

When this package is installed, it generates time zone abbreviations like 'EST' to be compatible with human tradition and POSIX. Here are the general rules used for choosing time zone abbreviations, in decreasing order of importance:

Application writers should note that these abbreviations are ambiguous in practice: e.g., 'CST' means one thing in China and something else in North America, and 'IST' can refer to time in India, Ireland or Israel. To avoid ambiguity, use numeric UT offsets like '-0600' instead of time zone abbreviations like 'CST'.
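As a sketch of that advice, the following uses Python's standard datetime module with fixed-offset zones, so it needs no tz data at all. The helper name is hypothetical. The two zones chosen are six hours west and eight hours east of UT, both of which are sometimes abbreviated 'CST'.

```python
from datetime import datetime, timedelta, timezone


def format_with_numeric_offset(when):
    """Format an aware datetime with a numeric UT offset, not an abbreviation."""
    if when.utcoffset() is None:
        raise ValueError("a timezone-aware datetime is required")
    return when.strftime("%Y-%m-%d %H:%M:%S %z")


six_hours_west = timezone(timedelta(hours=-6))
eight_hours_east = timezone(timedelta(hours=8))
stamp = datetime(2020, 1, 15, 12, 0, 0, tzinfo=timezone.utc)

print(format_with_numeric_offset(stamp.astimezone(six_hours_west)))
# 2020-01-15 06:00:00 -0600
print(format_with_numeric_offset(stamp.astimezone(eight_hours_east)))
# 2020-01-15 20:00:00 +0800
```

The numeric form identifies the instant unambiguously, which the abbreviation alone cannot do.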

Accuracy of the tz database

The tz database is not authoritative, and it surely has errors. Corrections are welcome and encouraged; see the file CONTRIBUTING. Users requiring authoritative data should consult national standards bodies and the references cited in the database's comments.

Errors in the tz database arise from many sources:

In short, many, perhaps most, of the tz database's pre-1970 and future timestamps are either wrong or misleading. Any attempt to pass the tz database off as the definition of time should be unacceptable to anybody who cares about the facts. In particular, the tz database's LMT offsets should not be considered meaningful, and should not prompt creation of zones merely because two locations differ in LMT or transitioned to standard time at different dates.

Time and date functions

The tz code contains time and date functions that are upwards compatible with those of POSIX.

POSIX has the following properties and limitations.

These are the extensions that have been made to the POSIX functions:

Points of interest to folks with other systems:

The functions that are conditionally compiled if STD_INSPIRED is defined should, at this point, be looked on primarily as food for thought. They are not in any sense "standard compatible" – some are not, in fact, specified in any standard. They do, however, represent responses of various authors to standardization proposals.

Other time conversion proposals, in particular the one developed by folks at Hewlett Packard, offer a wider selection of functions that provide capabilities beyond those provided here. The absence of such functions from this package is not meant to discourage the development, standardization, or use of such functions. Rather, their absence reflects the decision to make this package contain valid extensions to POSIX, to ensure its broad acceptability. If more powerful time conversion functions can be standardized, so much the better.

Interface stability

The tz code and data supply the following interfaces:

Interface changes in a release attempt to preserve compatibility with recent releases. For example, tz data files typically do not rely on recently-added zic features, so that users can run older zic versions to process newer data files. The document "Sources for time zone and daylight saving time data" describes how releases are tagged and distributed.

Interfaces not listed above are less stable. For example, users should not rely on particular UT offsets or abbreviations for timestamps, as data entries are often based on guesswork and these guesses may be corrected or improved.

Calendrical issues

Calendrical issues are a bit out of scope for a time zone database, but they indicate the sort of problems that we would run into if we extended the time zone database further into the past. An excellent resource in this area is Nachum Dershowitz and Edward M. Reingold, Calendrical Calculations: Third Edition, Cambridge University Press (2008). Other information and sources are given in the file 'calendars' in the tz distribution. They sometimes disagree.

Time and time zones on other planets

Some people's work schedules use Mars time. Jet Propulsion Laboratory (JPL) coordinators have kept Mars time on and off at least since 1997 for the Mars Pathfinder mission. Some of their family members have also adapted to Mars time. Dozens of special Mars watches were built for JPL workers who kept Mars time during the Mars Exploration Rovers mission (2004). These timepieces look like normal Seikos and Citizens but use Mars seconds rather than terrestrial seconds.

A Mars solar day is called a "sol" and has a mean period equal to about 24 hours 39 minutes 35.244 seconds in terrestrial time. It is divided into a conventional 24-hour clock, so each Mars second equals about 1.02749125 terrestrial seconds.
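The length of a Mars second follows directly from those figures. The sketch below repeats the arithmetic; the constant and function names are illustrative only.

```python
# Mean sol length in terrestrial seconds: 24 h 39 min 35.244 s.
SOL_IN_TERRESTRIAL_SECONDS = 24 * 3600 + 39 * 60 + 35.244

# A conventional 24-hour clock has 86400 seconds per day.
MARS_SECONDS_PER_SOL = 24 * 60 * 60

MARS_SECOND = SOL_IN_TERRESTRIAL_SECONDS / MARS_SECONDS_PER_SOL


def terrestrial_to_mars_seconds(terrestrial_seconds):
    """Convert a duration in terrestrial seconds to Mars seconds."""
    return terrestrial_seconds / MARS_SECOND


print(round(MARS_SECOND, 8))  # 1.02749125
```

Dividing a full sol of terrestrial seconds by this ratio gives back 86400 Mars seconds, up to floating-point rounding.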

The prime meridian of Mars goes through the center of the crater Airy-0, named in honor of the British astronomer who built the Greenwich telescope that defines Earth's prime meridian. Mean solar time on the Mars prime meridian is called Mars Coordinated Time (MTC).

Each landed mission on Mars has adopted a different reference for solar time keeping, so there is no real standard for Mars time zones. For example, the Mars Exploration Rover project (2004) defined two time zones "Local Solar Time A" and "Local Solar Time B" for its two missions, each zone designed so that its time equals local true solar time at approximately the middle of the nominal mission. Such a "time zone" is not particularly suited for any application other than the mission itself.

Many calendars have been proposed for Mars, but none has achieved wide acceptance. Astronomers often use Mars Sol Date (MSD), which is a sequential count of Mars solar days elapsed since about 1873-12-29 12:00 GMT.

In our solar system, Mars is the planet with time and calendar most like Earth's. On other planets, Sun-based time and calendars would work quite differently. For example, although Mercury's sidereal rotation period is 58.646 Earth days, Mercury revolves around the Sun so rapidly that an observer on Mercury's equator would see a sunrise only every 175.97 Earth days, i.e., a Mercury year is 0.5 of a Mercury day. Venus is more complicated, partly because its rotation is slightly retrograde: its year is 1.92 of its days. Gas giants like Jupiter are trickier still, as their polar and equatorial regions rotate at different rates, so that the length of a day depends on latitude. This effect is most pronounced on Neptune, where the day is about 12 hours at the poles and 18 hours at the equator.

Although the tz database does not support time on other planets, such timekeeping is documented here in the hope that support will be added eventually.

Sources: