From: | Paul George <p(dot)a(dot)george19(at)gmail(dot)com> |
---|---|
To: | pgsql-hackers(at)lists(dot)postgresql(dot)org, Richard Guo <guofenglinux(at)gmail(dot)com> |
Subject: | behavior of GROUP BY with VOLATILE expressions |
Date: | 2024-07-19 14:20:33 |
Message-ID: | CALA8mJrDQhL-kntd=ypBgwvogL8=kspn5za1Mxv+mS=dinL5Sg@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
Hackers,
I wanted to surface a discussion in [1] regarding the expected behavior of
GROUP BY with VOLATILE expressions. There seems to be a discrepancy between
how volatile functions (RANDOM(), also confirmed with TIMEOFDAY()) and
subqueries are evaluated in groups. In the examples below, volatile
functions do not always appear to be evaluated per-call (evidenced by
looking at EXPLAIN or results) whereas scalar subqueries always appear to
be independently evaluated.
Based on the docs, "A query using a volatile function will re-evaluate the
function at every row where its value is needed," it seems that the
handling of subqueries is correct and that each call to RANDOM() should be
evaluated (not the current behavior). But, what is correct/anticipated?
[version: PostgreSQL 16.3]
-- ### grouping volatile functions
-- single evaluation of RANDOM()
select random(), random(), random() group by 1;
random | random | random
--------------------+--------------------+--------------------
0.5156775158087117 | 0.5156775158087117 | 0.5156775158087117
(1 row)
-- two evaluations of RANDOM()
select random(), random(), random() group by 1, 2;
random | random | random
---------------------+---------------------+---------------------
0.36612763448670793 | 0.23423805164449374 | 0.36612763448670793
(1 row)
-- three evaluations of RANDOM()
select random(), random(), random() group by 1, 2, 3;
random | random | random
--------------------+--------------------+--------------------
0.2292929455776751 | 0.6613628224046473 | 0.5367692073422399
(1 row)
-- single evaluation of RANDOM()
select random(), random(), random() group by random();
random | random | random
--------------------+--------------------+--------------------
0.3069805404158834 | 0.3069805404158834 | 0.3069805404158834
(1 row)
-- single evaluation of RANDOM()
select random(), random(), random() group by random(), random();
random | random | random
--------------------+--------------------+--------------------
0.2860459945718521 | 0.2860459945718521 | 0.2860459945718521
(1 row)
-- single evaluation of RANDOM()
select random(), random(), random() group by random(), random(), random();
random | random | random
--------------------+--------------------+--------------------
0.3249129391658361 | 0.3249129391658361 | 0.3249129391658361
(1 row)
-- ### grouping scalar subqueries
-- each subquery evaluated
select (select random()), (select random()), (select random()) group by 1;
random | random | random
---------------------+--------------------+--------------------
0.30149979064538757 | 0.7911979526441186 | 0.5251471322291046
(1 row)
-- each subquery evaluated
select (select random()), (select random()), (select random()) group by
(select random());
random | random | random
--------------------+--------------------+----------------------
0.3411533489925591 | 0.4359004781684166 | 0.018305770511828356
(1 row)
-- ### sample EXPLAINs
-- two evaluations of RANDOM()
explain (verbose, costs off) select random(), random(), random() group by
1, 2;
QUERY PLAN
----------------------------------------------
HashAggregate
Output: (random()), (random()), (random())
Group Key: random(), random()
-> Result
Output: random(), random()
(5 rows)
-- singe evaluation of RANDOM()
explain (verbose, costs off) select random(), random(), random() group by
random(), random();
QUERY PLAN
----------------------------------------------
HashAggregate
Output: (random()), (random()), (random())
Group Key: random()
-> Result
Output: random()
(5 rows)
-Paul-
From | Date | Subject | |
---|---|---|---|
Next Message | David E. Wheeler | 2024-07-19 14:22:18 | Re: Document DateStyle effect on jsonpath string() |
Previous Message | Robert Haas | 2024-07-19 14:14:14 | Re: Set log_lock_waits=on by default |