PostgreSQL 8.1.23 Documentation
PostgreSQL uses an internal heuristic parser for all date/time input support. Dates and times are input as strings, and are broken up into distinct fields with a preliminary determination of what kind of information may be in the field. Each field is interpreted and either assigned a numeric value, ignored, or rejected. The parser contains internal lookup tables for all textual fields, including months, days of the week, and time zones.
This appendix includes information on the content of these lookup tables and describes the steps used by the parser to decode dates and times.
The date/time type inputs are all decoded using the following procedure.
Break the input string into tokens and categorize each token as a string, time, time zone, or number.
If the numeric token contains a colon (:), this is a time string. Include all subsequent digits and colons.
If the numeric token contains a dash (-), slash (/), or two or more dots (.), this is a date string which may have a text month.
If the token is numeric only, then it is either a single field or an ISO 8601 concatenated date (e.g., 19990113 for January 13, 1999) or time (e.g., 141516 for 14:15:16).
If the token starts with a plus (+) or minus (-), then it is either a time zone or a special field.
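As a rough illustration of the categorization rules above, the following Python sketch classifies a single token. It is not the parser's actual code: the function name `classify_token` and the returned category labels are hypothetical, and it assumes the input has already been split on whitespace.

```python
def classify_token(token: str) -> str:
    """Categorize one token following the rules described above (sketch only)."""
    if not token:
        raise ValueError("empty token")
    first = token[0]
    if first in "+-":
        # A leading sign marks a time zone or a special field.
        return "tz_or_special"
    if first.isdigit():
        if ":" in token:
            # A colon in a numeric token marks a time string, e.g. 14:15:16.
            return "time"
        if "-" in token or "/" in token or token.count(".") >= 2:
            # Dash, slash, or two or more dots mark a date string,
            # possibly with a text month (e.g. 13-Jan-1999).
            return "date"
        # Digits only (or digits with a single dot): a single field or a
        # concatenated ISO 8601 date or time such as 19990113 or 141516.
        return "number"
    if first.isalpha():
        return "string"
    raise ValueError(f"cannot categorize token: {token!r}")
```

For example, `classify_token("1999-01-13")` yields `"date"`, while `classify_token("19990113")` yields `"number"` and is left for the numeric rules to resolve.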
If the token is a text string, match it against the possible strings.
Do a binary-search table lookup for the token as either a special string (e.g., today), day (e.g., Thursday), month (e.g., January), or noise word (e.g., at, on).
Set field values and bit mask for fields. For example, set year, month, day for today, and additionally hour, minute, second for now.
If not found, do a similar binary-search table lookup to match the token with a time zone.
If still not found, throw an error.
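The two-stage lookup described above can be sketched in Python with the standard `bisect` module. The tables here are hypothetical and heavily abbreviated, the names `lookup_text_token`, `DATE_WORDS`, and `TIME_ZONES` are invented for illustration, and lowercasing the token before the search is an assumption of this sketch.

```python
from bisect import bisect_left

# Hypothetical, abbreviated lookup tables, kept sorted by key so that
# a binary search can be used. The real tables are far larger.
DATE_WORDS = sorted([
    ("at", "noise"), ("january", "month"), ("now", "special"),
    ("on", "noise"), ("thursday", "day"), ("today", "special"),
])
TIME_ZONES = sorted([("est", -5 * 3600), ("utc", 0)])  # offsets in seconds

DATE_KEYS = [key for key, _ in DATE_WORDS]
ZONE_KEYS = [key for key, _ in TIME_ZONES]


def _binary_search(keys, table, key):
    """Return the value stored for key, or None if key is absent."""
    i = bisect_left(keys, key)
    if i < len(keys) and keys[i] == key:
        return table[i][1]
    return None


def lookup_text_token(token: str):
    key = token.lower()  # assumption: matching is case-insensitive
    kind = _binary_search(DATE_KEYS, DATE_WORDS, key)
    if kind is not None:
        return (kind, key)
    offset = _binary_search(ZONE_KEYS, TIME_ZONES, key)
    if offset is not None:
        return ("timezone", offset)
    # Neither table contains the token.
    raise ValueError(f"unrecognized date/time token: {token!r}")
```

A caller would then set the appropriate field values and bit mask based on the returned kind; that bookkeeping is omitted here.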
When the token is a number or number field:
If there are eight or six digits, and if no other date fields have been previously read, then interpret as a "concatenated date" (e.g., 19990118 or 990118). The interpretation is YYYYMMDD or YYMMDD.
If the token is three digits and a year has already been read, then interpret as day of year.
If the token has four or six digits and a year has already been read, then interpret as a time (HHMM or HHMMSS).
If the token has three or more digits and no date fields have yet been found, interpret as a year (this forces yy-mm-dd ordering of the remaining date fields).
Otherwise the date field ordering is assumed to follow the DateStyle setting: mm-dd-yy, dd-mm-yy, or yy-mm-dd. Throw an error if a month or day field is found to be out of range.
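The order in which the numeric rules above are tried matters, so a small Python sketch may help. It only decides which rule applies to a run of digits. The function name `interpret_number`, its two boolean parameters, and the returned labels are hypothetical and are not taken from the parser's source.

```python
def interpret_number(digits: str, have_year: bool, have_date_fields: bool) -> str:
    """Pick the interpretation of a digit string, trying the rules in order.

    have_year:        a year has already been read
    have_date_fields: any date field (year, month, or day) has been read
    """
    n = len(digits)
    if n in (8, 6) and not have_date_fields:
        return "concatenated_date"   # YYYYMMDD or YYMMDD
    if n == 3 and have_year:
        return "day_of_year"
    if n in (4, 6) and have_year:
        return "time"                # HHMM or HHMMSS
    if n >= 3 and not have_date_fields:
        return "year"                # forces yy-mm-dd for remaining fields
    return "datestyle_field"         # ordering follows the DateStyle setting
```

Under these assumptions, `interpret_number("990118", False, False)` is treated as a concatenated date, whereas `interpret_number("141516", True, True)` is treated as a time because a year has already been read.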
If BC has been specified, negate the year and add one for internal storage. (There is no year zero in the Gregorian calendar, so numerically 1 BC becomes year zero.)
If BC was not specified, and if the year field was two digits in length, then adjust the year to four digits: if the field is less than 70, add 2000; otherwise add 1900.
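The two year adjustments above amount to simple arithmetic, sketched here in Python. The function name `adjust_year` and its parameters are hypothetical; `digits` stands for the number of digits in the year field as it was entered.

```python
def adjust_year(year: int, digits: int, bc: bool) -> int:
    """Apply the BC and two-digit-year adjustments described above (sketch)."""
    if bc:
        # Negate and add one: 1 BC is stored as year 0, 2 BC as -1, and so on.
        return -year + 1
    if digits == 2:
        # Two-digit years below 70 fall in 20xx, the rest in 19xx.
        return year + (2000 if year < 70 else 1900)
    return year
```

With this sketch, a two-digit 69 becomes 2069 and a two-digit 70 becomes 1970, while a four-digit 0099 is left as AD 99.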
Tip: Gregorian years AD 1-99 may be entered by using 4 digits with leading zeros (e.g., 0099 is AD 99). Previous versions of PostgreSQL accepted years with three digits and with single digits, but as of version 7.0 the rules have been tightened up to reduce the possibility of ambiguity.