Re: Strange behavior of function date_trunc

From: Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com>
To: Pavel Luzanov <p(dot)luzanov(at)postgrespro(dot)ru>, pgsql-general(at)lists(dot)postgresql(dot)org
Subject: Re: Strange behavior of function date_trunc
Date: 2021-05-05 13:55:04
Message-ID: 8986e9dd-576a-3c3b-b523-3052519387fd@enterprisedb.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

On 5/5/21 3:23 PM, Pavel Luzanov wrote:
> Hello,
>
> It is very likely that the date_trunc function in the following example
> is executed for each line of the query. Although it marked as a STABLE
> and could only be called once.
>

It could, but that's just an option - the database may do that, but it's
not required to do it. In this case it might be beneficial, but it'd
make the planner more complex etc.

> EXPLAIN (ANALYZE)
> SELECT * FROM generate_series('2021-01-01', '2021-06-01', '1
> s'::interval) AS g(x) WHERE g.x >= date_trunc('day',
> '2021-05-05'::timestamptz);
>                                                           QUERY PLAN
> -------------------------------------------------------------------------------------------------------------------------------
>
>  Function Scan on generate_series g  (cost=0.00..15.00 rows=333
> width=8) (actual time=2801.884..3263.328 rows=2332801 loops=1)
>    Filter: (x >= date_trunc('day'::text, '2021-05-05
> 00:00:00+03'::timestamp with time zone))
>    Rows Removed by Filter: 10713600
>  Planning Time: 0.040 ms
>  Execution Time: 3336.657 ms
>
> When replacing date_trunc with now, the query is much faster:
>
> EXPLAIN (ANALYZE)
> SELECT * FROM generate_series('2021-01-01', '2021-06-01', '1
> s'::interval) AS g(x)
> WHERE g.x >= now();
>                                                           QUERY PLAN
> -------------------------------------------------------------------------------------------------------------------------------
>
>  Function Scan on generate_series g  (cost=0.00..15.00 rows=333
> width=8) (actual time=1648.777..1845.430 rows=2275325 loops=1)
>    Filter: (x >= now())
>    Rows Removed by Filter: 10771076
>  Planning Time: 0.039 ms
>  Execution Time: 1918.767 ms
>
> The variant with now works almost as fast as with the constant. This
> suggests me that perhaps date_trunc is being executed for every line of
> the query:
>
> EXPLAIN (ANALYZE)
> SELECT * FROM generate_series('2021-01-01', '2021-06-01', '1
> s'::interval) AS g(x)
> WHERE g.x >= '2021-05-05'::timestamptz;
>                                                           QUERY PLAN
> -------------------------------------------------------------------------------------------------------------------------------
>
>  Function Scan on generate_series g  (cost=0.00..12.50 rows=333
> width=8) (actual time=1628.743..1826.841 rows=2332801 loops=1)
>    Filter: (x >= '2021-05-05 00:00:00+03'::timestamp with time zone)
>    Rows Removed by Filter: 10713600
>  Planning Time: 0.033 ms
>  Execution Time: 1901.680 ms
>
> In this regard, I have two questions:
> 1. How can I find out exactly how many times the date_trunc function has
> been executed? So far, these are just my assumptions.
> 2. If date_trunc is indeed called multiple times, why is this happening?
>

Well, it'd not like date_trunc is executed for each row while now() is
executed only once. The functions are executed for each row in both
cases, but now() is simply much cheaper - it just returns a value that
is already calculated, while date_trunc has to parse and truncate the
value, etc.

You can use CTE to execute it just once, I think:

with x as (select date_trunc('day', '2021-04-01'::timestamptz) as x)
select * from t where a > (select x from x);

regards
Tomas

In response to

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Bruce Momjian 2021-05-05 14:02:14 Re: PostgreSQL upgrade from 10 to 12 fails with "ERROR: column r.relhasoids does not exist at character 1616"
Previous Message Pavel Luzanov 2021-05-05 13:23:09 Strange behavior of function date_trunc