Re: truncating timestamps on arbitrary intervals

From: John Naylor <john(dot)naylor(at)2ndquadrant(dot)com>
To: Isaac Morland <isaac(dot)morland(at)gmail(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, David Fetter <david(at)fetter(dot)org>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: truncating timestamps on arbitrary intervals
Date: 2020-03-15 06:26:07
Message-ID: CACPNZCuxeG1mP-6=qGJcMpHa8SmHaTXK-zXC8w19HWnu1N-zqA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Fri, Mar 13, 2020 at 7:48 PM Isaac Morland <isaac(dot)morland(at)gmail(dot)com> wrote:
>
> On Fri, 13 Mar 2020 at 03:13, John Naylor <john(dot)naylor(at)2ndquadrant(dot)com> wrote:
>
>> - align weeks to start on Sunday
>> select date_trunc_interval('7 days'::interval, TIMESTAMP '2020-02-11
>> 01:01:01.0', TIMESTAMP '1900-01-02');
>> date_trunc_interval
>> ---------------------
>> 2020-02-09 00:00:00
>> (1 row)
>
>
> I'm confused by this. If my calendars are correct, both 1900-01-02 and 2020-02-11 are Tuesdays. So if the date being adjusted and the origin are both Tuesday, shouldn't the day part be left alone when truncating to 7 days?

Thanks for taking a look! The non-intuitive behavior you found is
because the patch shifts the timestamp before converting to the
internal pg_tm type. The pg_tm type stores day of the month, which is
used for the calculation. It's not counting the days since the origin.
Then the result is shifted back.

To get more logical behavior, perhaps the optional parameter is better
as an offset instead of an origin. Alternatively (or additionally),
the function could do the math on int64 timestamps directly.

> Also, I'd like to confirm that the default starting point for 7 day periods (weeks) is Monday, per ISO.

That's currently the behavior in the existing date_trunc function,
when passed the string 'week'. Given that keyword, it calculates the
week of the year.

When using the proposed function with arbitrary intervals, it uses day
of the month, as found in the pg_tm struct. It doesn't treat 7 days
differently then 5 or 10 without user input (origin or offset), since
there is nothing special about 7 day intervals as such internally. To
show the difference between date_trunc, and date_trunc_interval as
implemented in v5 with no origin:

select date_trunc('week', d), count(*) from generate_series(
'2020-02-01'::timestamp, '2020-03-31', '1 day') d group by 1 order by
1;
date_trunc | count
---------------------+-------
2020-01-27 00:00:00 | 2
2020-02-03 00:00:00 | 7
2020-02-10 00:00:00 | 7
2020-02-17 00:00:00 | 7
2020-02-24 00:00:00 | 7
2020-03-02 00:00:00 | 7
2020-03-09 00:00:00 | 7
2020-03-16 00:00:00 | 7
2020-03-23 00:00:00 | 7
2020-03-30 00:00:00 | 2
(10 rows)

select date_trunc_interval('7 days'::interval, d), count(*) from
generate_series( '2020-02-01'::timestamp, '2020-03-31', '1 day') d
group by 1 order by 1;
date_trunc_interval | count
---------------------+-------
2020-02-01 00:00:00 | 7
2020-02-08 00:00:00 | 7
2020-02-15 00:00:00 | 7
2020-02-22 00:00:00 | 7
2020-02-29 00:00:00 | 1
2020-03-01 00:00:00 | 7
2020-03-08 00:00:00 | 7
2020-03-15 00:00:00 | 7
2020-03-22 00:00:00 | 7
2020-03-29 00:00:00 | 3
(10 rows)

Resetting the day every month is counterintuitive if not broken, and
as I mentioned it might make more sense to use the int64 timestamp
directly, at least for intervals less than one month. I'll go look
into doing that.

--
John Naylor https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Dilip Kumar 2020-03-15 07:45:19 Re: [HACKERS] Moving relation extension locks out of heavyweight lock manager
Previous Message James Coleman 2020-03-15 02:55:23 Re: [PATCH] Incremental sort (was: PoC: Partial sort)