Re: wCTE behaviour

From: Hitoshi Harada <umi(dot)tanuki(at)gmail(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Yeb Havinga <yebhavinga(at)gmail(dot)com>, Marko Tiikkaja <marko(dot)tiikkaja(at)cs(dot)helsinki(dot)fi>, David Fetter <david(at)fetter(dot)org>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: wCTE behaviour
Date: 2010-11-12 16:50:46
Message-ID: AANLkTi=Sf6nXxTwR_ik7TensYsnSctPV=RC1uDEMAnWj@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

2010/11/13 Robert Haas <robertmhaas(at)gmail(dot)com>:
> On Fri, Nov 12, 2010 at 10:25 AM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>> Yeb Havinga <yebhavinga(at)gmail(dot)com> writes:
>>> On 2010-11-11 17:50, Marko Tiikkaja wrote:
>>>> Just to be clear, the main point is whether they see the data
>>>> modifications or not.  The simplest case to point out this behaviour is:
>>>>
>>>> WITH t AS (DELETE FROM foo)
>>>> SELECT * FROM foo;
>>>>
>>>> And the big question is: what state of "foo" should the SELECT
>>>> statement see?
>>
>>> Since t is not referenced in the query, foo should not be deleted at
>>> all,
>>
>> Yeah, that's another interesting question: should we somehow force
>> unreferenced CTEs to be evaluated anyhow?  Now that I think about it,
>> there was also some concern about the possibility of the outer query
>> not reading the CTE all the way to the end, ie
>>
>>        WITH t AS (DELETE FROM foo RETURNING *)
>>        SELECT * FROM t LIMIT 1;
>>
>> How many rows does this delete?  I think we concluded that we should
>> force the DELETE to be run to conclusion even if the outer query didn't
>> read it all.  From an implementation standpoint that makes it more
>> attractive to do the DELETE first and stick its results in a tuplestore
>> --- but I still think we should view that as an implementation detail,
>> not as part of the specification.
>
> Yeah, I think we have to force any DML statements in CTEs to run to
> completion, whether we need the results or not, and even if they are
> unreferenced.  Otherwise it's going to be really confusing, I fear.

One thing that has annoyed me while designing this feature is if as
Tom suggests the all queries are executed in the same snapshot and
optimized as the current read-only CTE does we are tempted to support
recursive and forward-reference in even DML CTE. It explodes out my
head and I'd like not to think about it if we can.

On the other hand, different-snapshot, serialized execution model
occurs the problem I originally rose in the previous thread, in which
the space to store the data shared among different plans is missing.
It's of course doable, but the easier implementation the better.

I'm inclined to agree with the same snapshot model, that is not only
easier to implement but also fits the current SQL processing design
and the existing CTE specification. Not only from the developer's view
but consistency from user's view. Whatever the standard says on the
DML *subquery*, we're going to create our new *CTE* feature. Yes, this
is CTE. For recursive and forward-reference issue, we can just forbid
them in DML CTE at first.

Regards,

--
Hitoshi Harada

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Andrew Dunstan 2010-11-12 17:03:14 Re: Refactoring the Type System
Previous Message Bruce Momjian 2010-11-12 16:47:27 Re: Re: [BUGS] BUG #5650: Postgres service showing as stopped when in fact it is running