Re: date_trunc function in interval version

From: Przemysław Sztoch <przemyslaw(at)sztoch(dot)pl>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com>, John Naylor <johncnaylorls(at)gmail(dot)com>, Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: date_trunc function in interval version
Date: 2024-05-18 21:19:56
Message-ID: 2351923c-9c03-65d7-9944-80e3be7f0498@sztoch.pl
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Robert Haas wrote on 5/15/2024 9:29 PM:
> On Mon, Mar 4, 2024 at 5:03 AM Przemysław Sztoch <przemyslaw(at)sztoch(dot)pl> wrote:
>> Apparently the functionality is identical to date_bin.
>> When I saw date_bin in the documentation, I thought it solved all my problems.
>> Unfortunately, DST problems have many corner cases.
>> I tried to change date_bin several times, but unfortunately in some cases it would start working differently than before.
> So, first of all, thanks for taking an interest and sending a patch.
>
> In order for the patch to have a chance of being accepted, we would
> need to have a clear understanding of exactly how this patch is
> different from the existing date_bin(). If we knew that, we could
> decide either that (a) date_bin does the right thing and your patch
> does the wrong thing and therefore we should reject your patch, or we
> could decide that (b) date_bin does the wrong thing and therefore we
> should fix it, or we could decide that (c) both date_bin and what this
> patch does are correct, in the sense of being sensible things to do,
> and there is a reason to have both. But if we don't really understand
> how they are different, which seems to be the case right now, then we
> can't make any decisions. And what that means in practice is that
> nobody is going to be willing to commit anything, and we're just going
> to go around in circles.
>
> Typically, this kind of research is the responsibility of the patch
> author: you're the one who wants something changed, so that means you
> need to provide convincing evidence that it should be. If someone else
> volunteers to do it, that's also cool, but it absolutely has to be
> done in order for there to be a chance of progress here. No committer
> is going to say "well, we already have date_bin, but Przemysław says
> his date_trunc is different somehow, so let's have both without
> understanding how exactly they're different." That's just not a
> realistic scenario. Just to name one problem, how would we document
> each of them? Users would expect the documentation to explain how two
> closely-related functions differ, but we will be unable to explain
> that if we don't know the answer ourselves.
>
> If you can't figure out exactly what the differences are by code
> inspection, then maybe one thing you could do to help unblock things
> here is provide some very clear examples of when they deliver the same
> results and when they deliver different results. Although there are no
> guarantees, that might lead somebody else to jump in and suggest an
> explanation, or further avenues of analysis, or some other helpful
> comment.
>
> Personally, what I suspect is that there's already a way to do what
> you want using date_bin(), maybe in conjunction with some casting or
> some calls to other functions that we already have. But it's hard to
> be sure because we just don't have the details. "DST problems have
> many corner cases" and "in some cases [date_bin] would start working
> differently than before" may be true statements as far as they go, but
> they're not very specific complaints. If you can describe *exactly*
> how date_bin fails to meet your expectations, there is a much better
> chance that something useful will happen here.
>
I would also like to thank Robert for presenting the matter in detail.

My function date_trunc ( interval, timestamp, ...) is similar to
original function date_trunc ( text, timestamp ...) .

My extension only gives more granularity.
We don't have a jump from hour to day. We can use 6h and 12h. It's the
same with minutes.
We can round to 30 minutes, 20minutes, 15 minutes, etc.

Using date_bin has a similar effect, but requires specifying the origin.
According to this origin,
subsequent buckets are then calculated. The need to provide this origin
is sometimes a very big problem.
Especially since you cannot use one origin when changing from summer to
winter time.

If we use one origin for example begin of year: 2024-01-01 00:00:00 then:
# SET timezone='Europe/Warsaw';
# SELECT date_bin('1 day', '2024-03-05 11:22:33', '2024-01-01
00:00:00'), date_trunc('day', '2024-03-05 11:22:33'::timestamptz);
2024-03-05 00:00:00+01 2024-03-05 00:00:00+01    date_bin works ok,
because we are before DST
# SELECT date_bin('1 day', '2024-05-05 11:22:33', '2024-01-01
00:00:00'), date_trunc('day', '2024-05-05 11:22:33'::timestamptz);
2024-05-05 01:00:00+02 2024-05-05 00:00:00+02    date_bin has problem,
because we are in May after DST

If anyone has an idea how to make date_bin work like date_trunc, please
provide an example.

--
Przemysław Sztoch | Mobile +48 509 99 00 66

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Josef Šimánek 2024-05-18 21:23:09 Re: Ignore Visual Studio's Temp Files While Working with PG on Windows
Previous Message Tom Lane 2024-05-18 21:16:56 Re: Ignore Visual Studio's Temp Files While Working with PG on Windows