Re: CTE inlining

From: Craig Ringer <craig(at)2ndquadrant(dot)com>
To: Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>
Cc: PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: CTE inlining
Date: 2017-05-02 06:23:26
Message-ID: CAMsr+YGFX3ohSsaZOb3JvRBwd03BBVvjwz8rZFcLV7TmJttt_A@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 2 May 2017 at 10:45, Craig Ringer <craig(at)2ndquadrant(dot)com> wrote:

> If we want fence behaviour, we should require people to declare their
> desire for fence behaviour, rather than treating it as a sort of
> hint-as-a-bug that we grandfather in because we're so desperate not to
> admit we have hints.

Answers to my question here https://dba.stackexchange.com/q/27425/7788
suggest that there's no justification in the standard for treating it
as a mandatory fence.

Here https://www.postgresql.org/message-id/29918.1320244719@sss.pgh.pa.us
Tom says

"CTEs are also treated as optimization fences; this is not so much an
optimizer limitation as to keep the semantics sane when the CTE contains
a writable query."

In my view, if we address that by walking the query tree for volatile
functions, we should be perfectly fine to inline single-reference
CTEs. Is that too simplistic?

Is the sticking point simply this docs text:

"A useful property of WITH queries is that they are evaluated only
once per execution of the parent query, even if they are referred to
more than once by the parent query or sibling WITH queries. Thus,
expensive calculations that are needed in multiple places can be
placed within a WITH query to avoid redundant work. Another possible
application is to prevent unwanted multiple evaluations of functions
with side-effects. However, the other side of this coin is that the
optimizer is less able to push restrictions from the parent query down
into a WITH query than an ordinary subquery. The WITH query will
generally be evaluated as written, without suppression of rows that
the parent query might discard afterwards. (But, as mentioned above,
evaluation might stop early if the reference(s) to the query demand
only a limited number of rows.)"

?

... because to me, that seems to offer a great deal of wiggle room in
"less able", "generally", etc, even if we treated the docs as a hard
specification of future behaviour. (Which we don't, since we change
things and document the changes).

We may also stop evaluation early, so there's already no guarantee the
full query is run.

Is there any *technical* rather than policy reason we could not safely
inline a CTE term referenced by only one query, where the CTE term is
a SELECT, contains no volatile function calls, is not RECURSIVE, and
does not its self reference another non-inlineable CTE term?

Josh Kupershmidt, in a comment on my blog post on this topic some time
ago (https://blog.2ndquadrant.com/postgresql-ctes-are-optimization-fences/)
pointed out that

"One notable exception to the “(non-writeable) CTEs are always
materialized” rule is that if a non-writeable CTE node is not
referenced anywhere, it won’t actually be evaluated. For example, this
query returns 1 instead of bombing out:

WITH not_executed AS (SELECT 1/0),

executed AS (SELECT 1)

SELECT * FROM executed;"

Prior discussions:

* https://www.postgresql.org/message-id/201209191305.44674.db@kavod.com
* http://archives.postgresql.org/pgsql-performance/2011-10/msg00208.php
* https://www.postgresql.org/message-id/29918.1320244719@sss.pgh.pa.us

Relevant posts where users get confused by our behaviour:

* https://dba.stackexchange.com/q/127828/7788
* https://news.ycombinator.com/item?id=7023907
* https://dba.stackexchange.com/q/84760/7788
* https://dba.stackexchange.com/q/97393/7788
* http://stackoverflow.com/q/20403792/398670
* http://stackoverflow.com/q/33731068/398670
* ....

Also, comments on my post:

* https://blog.2ndquadrant.com/postgresql-ctes-are-optimization-fences/

--
Craig Ringer http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Craig Ringer 2017-05-02 06:24:49 Re: Potential hot-standby bug around xacts committed but in xl_running_xacts
Previous Message Simon Riggs 2017-05-02 05:12:41 Re: Potential hot-standby bug around xacts committed but in xl_running_xacts