Re: [PATCH] Allow multiple recursive self-references

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Denis Hirn <denis(dot)hirn(at)uni-tuebingen(dot)de>
Cc: Pantelis Theodosiou <ypercube(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: [PATCH] Allow multiple recursive self-references
Date: 2021-03-23 19:29:42
Message-ID: 988140.1616527782@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Denis Hirn <denis(dot)hirn(at)uni-tuebingen(dot)de> writes:
>> I am not at all sure what the standard says about such recursion [...]

> as far as I know, the standard does not constraint the number of self-references
> of recursive common table expressions. However, I could be wrong here.

As far as I can see, the spec flat-out forbids this. In SQL:2021,
it's discussed in 7.17 <query expression> syntax rule 3) j) ix), which
defines

ix) If WLEi is recursive, then:
1) Let S be the stratum that contains WQNi.
2) If WQEi does not contain a <query specification> that contains
more than one <query name> referencing members of S, then WLEi is
linearly recursive.

("stratum" means a set of mutually-recursive WITH items), and they
helpfully explain

NOTE 308 — For example, if WLEi contains the <query specification>
SELECT * FROM A AS A1, A AS A2
where A is a <query name> in S, then WLEi is not linearly recursive. The
point is that this <query specification> contains two references to
members of S. It is irrelevant that they reference the same member of S.

and then the next rule says

x) If WLEi is recursive, then WLEi shall be linearly recursive.

So the problem with extending the spec here is that someday they might
extend it with some other semantics, and then we're between a rock and
a hard place.

If there were really compelling reasons why (a) we have to have this
feature and (b) there's only one reasonable way for it to act, hence
no likelihood that the SQL committee will choose something else, then
I'd be willing to think about getting out front of the committee.
But you haven't made either case.

>> I don't think any other DBMS has implemented this, except MariaDB. Tested here:

> There are a few recent DBMSs that I know of that support this: HyPer, Umbra, DuckDB, and NoisePage.

Do they all act the same? Has anybody that sits on the SQL committee
done it? (If you could point to DB2, in particular, I'd have some
confidence that the committee might standardize on their approach.)

A totally independent question is whether you've even defined a
well-defined behavior. There's an interesting comment in the spec:

NOTE 310 — The restrictions insure that each WLEi, viewed as a
transformation of the query names of the stratum, is monotonically
increasing. According to Tarski’s fixed point theorem, this insures that
there is a fixed point. The General Rules use Kleene’s fixed point
theorem to define a sequence that converges to the minimal fixed
point.

I don't know any of the underlying math here, but if you've got a
join of multiple recursion references then the number of output
rows could certainly be nonlinear, which very possibly breaks the
argument that there's a well-defined interpretation.

regards, tom lane

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2021-03-23 19:35:59 Re: pg_upgrade failing for 200+ million Large Objects
Previous Message Jan Wieck 2021-03-23 19:22:04 Re: pg_upgrade failing for 200+ million Large Objects