Re: behavior of GROUP BY with VOLATILE expressions

From: Paul George <p(dot)a(dot)george19(at)gmail(dot)com>
To: "David G(dot) Johnston" <david(dot)g(dot)johnston(at)gmail(dot)com>
Cc: pgsql-hackers(at)lists(dot)postgresql(dot)org, Richard Guo <guofenglinux(at)gmail(dot)com>
Subject: Re: behavior of GROUP BY with VOLATILE expressions
Date: 2024-07-19 21:21:05
Message-ID: CALA8mJqP75PUc9sJshi=KivnaoD2iD4KPk7GQz=xDwq6tp9ftA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Great, thanks for the links and useful past discussions! I figured I wasn't
the first to stumble across this, and it's interesting to see the issue
arise with ORDER BY [VOLATILE FUNC] as well.

My question was not so much about changing behavior as it was about
understanding what is desired, especially in light of the fact that
subqueries behave differently. From my reading of the links you provided,
it seems that even the notion of "desired" here is itself dubious and that
there is a case for reevaluating RANDOM() everywhere and a case for not
doing that. Given this murkiness, is it fair then to say that drawing
parallels between how GROUP BY subquery is handled is moot?

-Paul-

On Fri, Jul 19, 2024 at 7:48 AM David G. Johnston <
david(dot)g(dot)johnston(at)gmail(dot)com> wrote:

> On Fri, Jul 19, 2024 at 7:20 AM Paul George <p(dot)a(dot)george19(at)gmail(dot)com>
> wrote:
>
>>
>> I wanted to surface a discussion in [1] regarding the expected behavior
>> of GROUP BY with VOLATILE expressions. There seems to be a discrepancy
>> between how volatile functions (RANDOM(), also confirmed with TIMEOFDAY())
>> and subqueries are evaluated in groups. In the examples below, volatile
>> functions do not always appear to be evaluated per-call (evidenced by
>> looking at EXPLAIN or results) whereas scalar subqueries always appear to
>> be independently evaluated.
>>
>> Based on the docs, "A query using a volatile function will re-evaluate
>> the function at every row where its value is needed," it seems that the
>> handling of subqueries is correct and that each call to RANDOM() should be
>> evaluated (not the current behavior). But, what is correct/anticipated?
>>
>>
> The observed behaviors are unlikely to change. Prior discussions can be
> found regarding this:
>
>
> https://www.postgresql.org/message-id/flat/CZHAF947QQQO.27MAUK2SVMBXW%40nmfay.com
>
> David J.
>
>

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Nathan Bossart 2024-07-19 21:21:37 Re: optimizing pg_upgrade's once-in-each-database steps
Previous Message Thomas Munro 2024-07-19 21:04:53 Re: Build with LTO / -flto on macOS