behavior of GROUP BY with VOLATILE expressions

From: Paul George <p(dot)a(dot)george19(at)gmail(dot)com>
To: pgsql-hackers(at)lists(dot)postgresql(dot)org, Richard Guo <guofenglinux(at)gmail(dot)com>
Subject: behavior of GROUP BY with VOLATILE expressions
Date: 2024-07-19 14:20:33
Message-ID: CALA8mJrDQhL-kntd=ypBgwvogL8=kspn5za1Mxv+mS=dinL5Sg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hackers,

I wanted to surface a discussion in [1] regarding the expected behavior of
GROUP BY with VOLATILE expressions. There seems to be a discrepancy between
how volatile functions (RANDOM(), also confirmed with TIMEOFDAY()) and
subqueries are evaluated in groups. In the examples below, volatile
functions do not always appear to be evaluated per-call (evidenced by
looking at EXPLAIN or results) whereas scalar subqueries always appear to
be independently evaluated.

Based on the docs, "A query using a volatile function will re-evaluate the
function at every row where its value is needed," it seems that the
handling of subqueries is correct and that each call to RANDOM() should be
evaluated (not the current behavior). But, what is correct/anticipated?

[version: PostgreSQL 16.3]

-- ### grouping volatile functions
-- single evaluation of RANDOM()
select random(), random(), random() group by 1;
random | random | random
--------------------+--------------------+--------------------
0.5156775158087117 | 0.5156775158087117 | 0.5156775158087117
(1 row)

-- two evaluations of RANDOM()
select random(), random(), random() group by 1, 2;
random | random | random
---------------------+---------------------+---------------------
0.36612763448670793 | 0.23423805164449374 | 0.36612763448670793
(1 row)

-- three evaluations of RANDOM()
select random(), random(), random() group by 1, 2, 3;
random | random | random
--------------------+--------------------+--------------------
0.2292929455776751 | 0.6613628224046473 | 0.5367692073422399
(1 row)

-- single evaluation of RANDOM()
select random(), random(), random() group by random();
random | random | random
--------------------+--------------------+--------------------
0.3069805404158834 | 0.3069805404158834 | 0.3069805404158834
(1 row)

-- single evaluation of RANDOM()
select random(), random(), random() group by random(), random();
random | random | random
--------------------+--------------------+--------------------
0.2860459945718521 | 0.2860459945718521 | 0.2860459945718521
(1 row)

-- single evaluation of RANDOM()
select random(), random(), random() group by random(), random(), random();
random | random | random
--------------------+--------------------+--------------------
0.3249129391658361 | 0.3249129391658361 | 0.3249129391658361
(1 row)

-- ### grouping scalar subqueries
-- each subquery evaluated
select (select random()), (select random()), (select random()) group by 1;
random | random | random
---------------------+--------------------+--------------------
0.30149979064538757 | 0.7911979526441186 | 0.5251471322291046
(1 row)

-- each subquery evaluated
select (select random()), (select random()), (select random()) group by
(select random());
random | random | random
--------------------+--------------------+----------------------
0.3411533489925591 | 0.4359004781684166 | 0.018305770511828356
(1 row)

-- ### sample EXPLAINs
-- two evaluations of RANDOM()
explain (verbose, costs off) select random(), random(), random() group by
1, 2;
QUERY PLAN
----------------------------------------------
HashAggregate
Output: (random()), (random()), (random())
Group Key: random(), random()
-> Result
Output: random(), random()
(5 rows)

-- singe evaluation of RANDOM()
explain (verbose, costs off) select random(), random(), random() group by
random(), random();
QUERY PLAN
----------------------------------------------
HashAggregate
Output: (random()), (random()), (random())
Group Key: random()
-> Result
Output: random()
(5 rows)

-Paul-

[1]
https://www.postgresql.org/message-id/CAMbWs4_xhRAfy-i%3D%3DnFMZGukw4M%3DOnkfwpfEfiGmAx6a3SYBKw%40mail.gmail.com

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message David E. Wheeler 2024-07-19 14:22:18 Re: Document DateStyle effect on jsonpath string()
Previous Message Robert Haas 2024-07-19 14:14:14 Re: Set log_lock_waits=on by default