From: | Vitaly Burovoy <vitaly(dot)burovoy(at)gmail(dot)com> |
---|---|
To: | Thomas Munro <thomas(dot)munro(at)enterprisedb(dot)com> |
Cc: | PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: [PATH] Correct negative/zero year in to_date/to_timestamp |
Date: | 2016-02-23 01:54:05 |
Message-ID: | CAKOSWNmBLw1A5K_MtKoDv5p6jQTKtRZbEiMDJVnC=mfke_4nXA@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On 2/22/16, Thomas Munro <thomas(dot)munro(at)enterprisedb(dot)com> wrote:
> On Tue, Feb 23, 2016 at 11:58 AM, Vitaly Burovoy
> <vitaly(dot)burovoy(at)gmail(dot)com> wrote:
>> Hello, Hackers!
>>
>> I'm writing another patch and while I was trying to cover corner cases
>> I found that to_date and to_timestamp work wrong if year in input
>> value is zero or negative:
>>
>> postgres=# SELECT
>> postgres-# y || '-06-01' as src
>> postgres-# ,CASE WHEN y>0 THEN ('00'||y||'-06-01') WHEN y<0 THEN
>> ('00'||(-y)||'-06-01 BC') END::date
>> postgres-# ,to_date(y || '-06-01', 'YYYY-MM-DD')
>> postgres-# ,to_timestamp(y || '-06-01', 'YYYY-MM-DD')
>> postgres-# FROM (VALUES(2),(1),(0),(-1),(-2))t(y);
>> src | date | to_date | to_timestamp
>> ----------+---------------+---------------+---------------------------
>> 2-06-01 | 0002-06-01 | 0002-06-01 | 0002-06-01 00:00:00+00
>> 1-06-01 | 0001-06-01 | 0001-06-01 | 0001-06-01 00:00:00+00
>> 0-06-01 | | 0001-06-01 BC | 0001-06-01 00:00:00+00 BC
>> -1-06-01 | 0001-06-01 BC | 0002-06-01 BC | 0002-06-01 00:00:00+00 BC
>> -2-06-01 | 0002-06-01 BC | 0003-06-01 BC | 0003-06-01 00:00:00+00 BC
>> (5 rows)
>>
>> Zero year (and century) is accepted and negative years differs by 1
>> from what they should be.
>>
>>
>> I've written a patch fixes that. With it results are correct:
>> postgres=# SELECT
>> postgres-# y || '-06-01' as src
>> postgres-# ,CASE WHEN y>0 THEN ('00'||y||'-06-01') WHEN y<0 THEN
>> ('00'||(-y)||'-06-01 BC') END::date
>> postgres-# ,to_date(y || '-06-01', 'YYYY-MM-DD')
>> postgres-# ,to_timestamp(y || '-06-01', 'YYYY-MM-DD')
>> postgres-# FROM (VALUES(2),(1),(-1),(-2))t(y);
>> src | date | to_date | to_timestamp
>> ----------+---------------+---------------+---------------------------
>> 2-06-01 | 0002-06-01 | 0002-06-01 | 0002-06-01 00:00:00+00
>> 1-06-01 | 0001-06-01 | 0001-06-01 | 0001-06-01 00:00:00+00
>> -1-06-01 | 0001-06-01 BC | 0001-06-01 BC | 0001-06-01 00:00:00+00 BC
>> -2-06-01 | 0002-06-01 BC | 0002-06-01 BC | 0002-06-01 00:00:00+00 BC
>> (4 rows)
>>
>>
>> When year "0" is given, it raises an ERROR:
>> postgres=# SELECT to_timestamp('0000*01*01', 'YYYY*MM*DD');
>> ERROR: invalid input string for "YYYY"
>> DETAIL: Year cannot be 0.
>>
>>
>> Also I change behavior for era indicator when negatives century or
>> year are given. In such case era indicator is ignored (for me it is
>> obvious signs should be OR-ed):
>> postgres=# SELECT to_timestamp('-0010*01*01 BC', 'YYYY*MM*DD BC')
>> postgres-# ,to_timestamp(' 0010*01*01 BC', 'YYYY*MM*DD BC');
>> to_timestamp | to_timestamp
>> ---------------------------+---------------------------
>> 0010-01-01 00:00:00+00 BC | 0010-01-01 00:00:00+00 BC
>> (1 row)
>>
>>
>> Testings, complains, advice, comment improvements are very appreciated.
>
> This seems to be a messy topic. The usage of "AD" and "BC" imply that
> TO_DATE is using the anno domini system which doesn't have a year 0,
> but in the DATE type perhaps we are using the ISO 8601 model[2] where
> 1 BC is represented as 0000, leading to the difference of one in all
> years before 1 AD?
>
> [1] https://en.wikipedia.org/wiki/Anno_Domini
> [2] https://en.wikipedia.org/wiki/ISO_8601#Years
>
> --
> Thomas Munro
> http://www.enterprisedb.com
Thank you for fast reply and for the link[2]. Be honest I didn't know
ISO8601 specifies 1 BC as +0000.
But the documentation[3] doesn't points that ISO8601 is used for
"YYYY", but it is mentioned for "IYYY", and it is slightly deceiving.
Also I remember that the other part of the documentation says[4] that
"Keep in mind there is no 0 AD" that's why I decided it is impossible
to pass 0000 for YYYY.
Currently behavior with YYYY=0 is still surprising in some cases:
postgres=# SELECT
postgres-# to_date('20 0000-01-01', 'CC YYYY-MM-DD'),
postgres-# to_date('20 0001-01-01', 'CC YYYY-MM-DD');
to_date | to_date
------------+------------
1901-01-01 | 0001-01-01
(1 row)
but the documentation[3] says "In conversions from string to timestamp
or date, the CC (century) field is ignored if there is a YYY, YYYY or
Y,YYY field."
So is it shared opinion that ISO8601 is used for "YYYY"?
[3]http://www.postgresql.org/docs/devel/static/functions-formatting.html
[4]http://www.postgresql.org/docs/devel/static/functions-datetime.html
--
Best regards,
Vitaly Burovoy
From | Date | Subject | |
---|---|---|---|
Next Message | Tom Lane | 2016-02-23 03:31:44 | Re: postgres_fdw vs. force_parallel_mode on ppc |
Previous Message | Michael Paquier | 2016-02-23 01:52:08 | Re: Writing new unit tests with PostgresNode |