Re: SEARCH and CYCLE clauses

From: Vik Fearing <vik(at)postgresfriends(dot)org>
To: Peter Eisentraut <peter(dot)eisentraut(at)2ndquadrant(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: SEARCH and CYCLE clauses
Date: 2020-05-22 10:40:12
Message-ID: 9ce4d370-048c-1d67-fffb-a86f17120132@postgresfriends.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 5/22/20 11:24 AM, Peter Eisentraut wrote:
> On 2020-05-20 21:28, Vik Fearing wrote:
>> 1)
>> There are some smart quotes in the comments that should be converted to
>> single quotes.
>
> ok, fixing that
>
>> 2)
>> This query is an infinite loop, as expected:
>>
>>    with recursive a as (select 1 as b union all select b from a)
>>    table a;
>>
>> But it becomes an error when you add a cycle clause to it:
>>
>>    with recursive a as (select 1 as b union all table a)
>>      cycle b set c to true default false using p
>>    table a;
>>
>>    ERROR:  each UNION query must have the same number of columns
>
> table a expands to select * from a, and if you have a cycle clause, then
> a has three columns, but the other branch of the union only has one, so
> that won't work anymore, will it?

It seems there was a copy/paste error here. The first query should have
been the same as the second but without the cycle clause.

It seems strange to me that adding a <search or cycle clause> would
break a previously working query. I would rather see the * expanded
before adding the new columns. This is a user's opinion, I don't know
how hard that would be to implement.

>> 3)
>> If I take the same infinite loop query but replace the TABLE syntax with
>> a SELECT and add a cycle clause, it's not an infinite loop anymore.
>>
>>    with recursive a as (select 1 as b union all select b from a)
>>      cycle b set c to true default false using p
>>    table a;
>>
>>   b | c |     p
>> ---+---+-----------
>>   1 | f | {(1)}
>>   1 | t | {(1),(1)}
>> (2 rows)
>>
>> Why does it stop?  It should still be an infinite loop.
>
> If you specify the cycle clause, then the processing will stop if it
> sees the same row more than once, which it did here.

Yes, this was a misplaced expectation on my part.

>> 4)
>> If I use NULL instead of false, I only get one row back.
>>
>>    with recursive a as (select 1 as b union all select b from a)
>>      cycle b set c to true default false using p
>>    table a;
>>
>>   b | c |   p
>> ---+---+-------
>>   1 |   | {(1)}
>> (1 row)
>
> If you specify null, then the cycle check expression will always fail,
> so it will abort after the first row.  (We should perhaps prohibit
> specifying null, but see below.)

I would rather make the cycle check expression work with null.

>> 5)
>> I can set both states to the same value.
>>
>>    with recursive a as (select 1 as b union all select b from a)
>>      cycle b set c to true default true using p
>>    table a;
>
>> This is a direct violation of 7.18 SR 2.b.ii.3 as well as common sense.
>>   BTW, I applaud your decision to violate the other part of that rule and
>> allowing any data type here.
>>
>>
>> 5)
>> The same rule as above says that the value and the default value must be
>> literals but not everything that a human might consider a literal is
>> accepted.  In particular:
>>
>>    with recursive a as (select 1 as b union all select b from a)
>>      cycle b set c to 1 default -1 using p
>>    table a;
>>
>>    ERROR:  syntax error at or near "-"
>>
>> Can we just accept a full a_expr here instead of AexprConst?  Both
>> DEFAULT and USING are fully reserved keywords.
>
> This is something we need to think about.  If we want to check at parse
> time whether the two values are not the same (and perhaps not null),
> then we either need to restrict the generality of what we can specify,
> or we need to be prepared to do full expression evaluation in the
> parser.  A simple and practical way might be to only allow string and
> boolean literal.  I don't have a strong opinion here.

I'm with Pavel on this one. Why does it have to be a parse-time error?
Also, I regularly see people write functions as sort of pauper's
variables, but a function call isn't allowed here.

----

Another bug I found. If I try to do your regression test as an
autonomous query, I get this:

with recursive

graph (f, t, label) as (
values (1, 2, 'arc 1 -> 2'),
(1, 3, 'arc 1 -> 3'),
(2, 3, 'arc 2 -> 3'),
(1, 4, 'arc 1 -> 4'),
(4, 5, 'arc 4 -> 5'),
(5, 1, 'arc 5 -> 1')
),

search_graph (f, t, label) as (
select * from graph g
union all
select g.*
from graph g, search_graph sg
where g.f = sg.t
)
cycle f, t set is_cycle to true default false using path

select * from search_graph;

ERROR: could not find CTE "graph"

----

As an improvement over the spec, I think the vast majority of people
will be using simple true/false values. Can we make that optional?

CYCLE f, t SET is_cycle USING path

would be the same as

CYCLE f, t SET is_cycle TO true DEFAULT false USING path
--
Vik Fearing

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message mkruk 2020-05-22 10:46:48 Re: Failed rpm package signature checks with reposync
Previous Message Pavel Stehule 2020-05-22 09:33:30 Re: SEARCH and CYCLE clauses