From: | "Kevin Grittner" <Kevin(dot)Grittner(at)wicourts(dot)gov> |
---|---|
To: | "ktm(at)rice(dot)edu" <ktm(at)rice(dot)edu> |
Cc: | "Florian Schoppmann" <Florian(dot)Schoppmann(at)emc(dot)com>, "Robert Haas" <robertmhaas(at)gmail(dot)com>,<pgsql-hackers(at)postgresql(dot)org>, "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
Subject: | Re: Invalid optimization of VOLATILE function in WHERE clause? |
Date: | 2012-09-19 20:26:03 |
Message-ID: | 5059E40B020000250004A521@gw.wicourts.gov |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
"ktm(at)rice(dot)edu" <ktm(at)rice(dot)edu> wrote:
> On Wed, Sep 19, 2012 at 02:39:12PM -0500, Kevin Grittner wrote:
>> In another thread, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>>> 2. Apply the WHERE condition to each row from 1, and drop rows
>>> that don't pass it.
>>
>> People expect that the results will be consistent with this
>> model, even if the implementation is optimized "under the
>> covers". I think correct semantics should trump performance
>> here.
> It seems like this is what happens here except that the function
> is evaluated once for the WHERE and not once per ROW. Both of
> these meet the criterion for 2 above and Tom's earlier comments
> both hold.
There really needs to be some way to specify that when an expression
is evaluated for each row in a set, a function used within that
expression is not optimized away for some rows. Fortunately we have
a way:
http://www.postgresql.org/docs/9.2/interactive/sql-createfunction.html
| VOLATILE indicates that the function value can change even within
| a single table scan, so no optimizations can be made. Relatively
| few database functions are volatile in this sense; some examples
| are random(), [...]
The behavior in the OP's query would certainly be sane if the
function were not VOLATILE; as it is, I have a hard time seeing this
as anything but a bug.
There is a workaround, if you don't mind ugly:
CREATE FUNCTION random_really_i_mean_it(dummy int)
RETURNS double precision
LANGUAGE plpgsql
VOLATILE
AS $$
BEGIN
-- no need to reference dummy parameter
RETURN random();
END;
$$;
WITH source AS (
SELECT i FROM generate_series(1,10) AS i
)
SELECT
i
FROM
source, (
SELECT
count(*) AS _n
FROM source
) AS _stats
WHERE
random_really_i_mean_it(i) < 5::DOUBLE PRECISION/_n;
-Kevin
From | Date | Subject | |
---|---|---|---|
Next Message | Kevin Grittner | 2012-09-19 20:30:34 | Re: Invalid optimization of VOLATILE function in WHERE clause? |
Previous Message | Simon Riggs | 2012-09-19 20:13:51 | Re: [v9.3] Extra Daemons (Re: elegant and effective way for running jobs inside a database) |