Re: to_date() and to_timestamp() with negative years

From: Adrian Klaver <adrian(dot)klaver(at)aklaver(dot)com>
To: Bryn Llewellyn <bryn(at)yugabyte(dot)com>
Cc: Tom Lane PostgreSQL <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-general list <pgsql-general(at)lists(dot)postgresql(dot)org>
Subject: Re: to_date() and to_timestamp() with negative years
Date: 2021-11-04 04:42:01
Message-ID: e1c27b01-e7c4-a5dc-ff2c-dcd0fccb1e4f@aklaver.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

On 11/3/21 19:12, Bryn Llewellyn wrote:
>> /adrian(dot)klaver(at)aklaver(dot)com <mailto:adrian(dot)klaver(at)aklaver(dot)com> wrote:/
>>

> Back to the point about separators, the "Current" doc has this bullet:
>
> «
> A separator (a space or non-letter/non-digit character) in the template
> string of to_timestamp and to_date matches any single separator in
> the input string or is skipped, unless the FX option is used...
> »
>
> (There's no such explanation in the Version 11 doc—but never mind that.)
> I read this to mean that a space IS a viable separator. And yet Tom's
> "nope, the space doesn't count [as a separator]" claims the opposite.
> The bullet's wording, by the way, says that the minus sign is a
> separator. But yet it can also be taken to confer the meaning "BC" to a
> year. No wonder I'm confused.
>
> Elsewhere the "Current" doc says that runs of two or more spaces have
> the same effect as a single space (in the absence of FX or FM complexity).
>
> No wonder that examples like I showed sometimes produce the wrong
> results, even after more than one round of tinkering with the C
> implementation. Try this test (the to-be-converted text has runs of five
> spaces, and the template has runs of ten spaces):

It comes down to determining what is a separator and what is the
negative sign.

>
> select to_date('     1950     02     14', 'YYYY          MM          DD');
> select to_date('     -1950     02     14', 'YYYY          MM          DD');

The row above returns:

select to_date(' -1950 02 14', 'YYYY MM DD');
to_date
---------------
1950-02-14 BC

As you state below.

Change it to:

select to_date(' -1950 02 14', ' YYYY MM DD');
to_date
------------
1950-02-14

and you don't get the BC as -1950 is taken as separator(-)1950 not
negative(-)1950.

>
> select to_date('     14     02     1950', 'DD          MM          YYYY');
> select to_date('     14     02     -1950', 'DD          MM          YYYY');
The above returns:

select to_date(' 14 02 -1950', 'DD MM YYYY');
to_date
------------
1950-02-14

Change it to:

select to_date(' 14 02 -1950', 'DD MMYYYY');
to_date
---------------
1950-02-14 BC

and you get BC as -1950 is taken as negative(-)1950 not separator(-)1950.

Not sure how this can be handled in a totally predictable way given the
unpredictable ways in which datetime strings are formatted?

The only thing I can say it is it points out that when working with
datetimes settling on a standard format is your best defense against
unpredictable results.

>
> "-1950" is taken as "BC" in the second to_date() but it is not so taken
> in the fourth to_date().
>

--
Adrian Klaver
adrian(dot)klaver(at)aklaver(dot)com

In response to

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Alex Magnum 2021-11-04 11:42:12 Re: Error with Insert from View with ON Conflict
Previous Message Bryn Llewellyn 2021-11-04 02:12:39 Re: to_date() and to_timestamp() with negative years