Re: Have I found an interval arithmetic bug?

From: Bryn Llewellyn <bryn(at)yugabyte(dot)com>
To: Bruce Momjian <bruce(at)momjian(dot)us>
Cc: Tom Lane PostgreSQL <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Zhihong Yu <zyu(at)yugabyte(dot)com>, Justin Pryzby <pryzby(at)telsasoft(dot)com>, Isaac Morland <isaac(dot)morland(at)gmail(dot)com>, John W Higgins <wishdev(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Have I found an interval arithmetic bug?
Date: 2021-04-13 17:55:31
Message-ID: B117C7D3-E4F9-44FE-9F7B-D5CE3C137304@yugabyte.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general pgsql-hackers

> On 12-Apr-2021, at 17:25, Bruce Momjian <bruce(at)momjian(dot)us> wrote:
>
> On Mon, Apr 12, 2021 at 05:20:43PM -0700, Bryn Llewellyn wrote:
>> I’d argue that the fact that this:
>>
>> ('0.3 months'::interval) + ('0.7 months'::interval)
>>
>> Is reported as '30 days' and not '1 month' is yet another
>> bug—precisely because of what I said in my previous email (sorry
>> that I forked the thread) where I referred to the fact that, in the
>> right test, adding 1 month gets a different answer than adding 30
>> days.
>
> Flowing _up_ is what these functions do:
>
> \df *justify*
> List of functions
> Schema | Name | Result data type | Argument data types | Type
> ------------+------------------+------------------+---------------------+------
> pg_catalog | justify_days | interval | interval | func
> pg_catalog | justify_hours | interval | interval | func
> pg_catalog | justify_interval | interval | interval | func
>
>> Yet another convincing reason to get rid of this flow down
>> business altogether.
>
> We can certainly get rid of all downflow, which in the current patch is
> only when fractional internal units are specified.
>
>> If some application wants to model flow-down, then it can do so with
>> trivial programming and full control over its own definition of the
>> rules.

“Yes please!” re Bruce’s “We can certainly get rid of all downflow, which in the current patch is only when fractional internal units are specified.”

Notice that a user who creates interval values explicitly only by using “make_interval()” will see no behavior change.

There’s another case of up-flow. When you subtract one timestamp value from another, and when they’re far enough apart, then the (internal representation of the) resulting interval value has both a seconds component and a days component. (But never, in my tests, a months component.) I assume that the implementation first gets the difference between the two timestamp values in seconds using (the moral equivalent of) “extract epoch”. And then, if this is greater than 24*60*60, it implements up-flow using the “rule-of-24”—never mind that this means that if you add the answer back to the timestamp value that you subtracted, then you might not get the timestamp value from which you subtracted. (This happens around a DST change and has been discussed earlier in the thread.)

The purpose of these three “justify” functions is dubious. I think that it’s best to think of the 3-component interval vector like an [x, y, z] vector in 3d geometric space, where the three coordinates are not mutually convertible because each has unique semantics. This point has been rehearsed many times in this long thread.

Having said this, it isn’t hard to understand the rules that the functions implement. And, most importantly, their use is voluntary. They are, though, no more than shipped and documented wrappers for a few special cases. A user could so easily write their own function like this:

1. Compute the values of the three components of the internal representation of the passed-in interval value using the “extract” feature and some simple arithmetic.

2. Derive the [minutes, days, seconds] values of a new representation using whatever rules you feel for.

3. Use these new values to create the return interval value.

For example, I might follow a discipline to use interval values that have only one of the three components of the internal representation non-zero. It’s easy to use the “domain” feature for this. (I can use these in any context where I can use the shipped interval.) I could then write a function to convert a “pure seconds” value of my_interval to a “pure years” value. And I could implement my own rules:

— Makes sense only for a large number of seconds that comes out to at least five years (else assert failure).

— Converts seconds to years using the rule that 1 year is, on average, 365.25*24*60*60 seconds, and then truncates it.

There’s no shipped function that does this, and this makes me suspect that I’d prefer to roll my own for any serious purpose.

The existence of the three “justify” functions is, therefore, harmless.

In response to

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Dmitry Koterov 2021-04-13 18:02:48 Suboptimal plan when IN(...), ORDER BY and LIMIT are used (no JOINs)
Previous Message LE MENTEC, SANDRINE 2021-04-13 15:04:57 RE: looking for a installation package to Using GSSAPI with Postgres12 for windows

Browse pgsql-hackers by date

  From Date Subject
Next Message Justin Pryzby 2021-04-13 17:58:48 Re: ALTER TABLE .. DETACH PARTITION CONCURRENTLY
Previous Message James Coleman 2021-04-13 17:40:01 Re: Binary search in ScalarArrayOpExpr for OR'd constant arrays