Possible documentation inaccuracy in optimizer README

From: Zeyuan Hu <ferrishu3886(at)gmail(dot)com>
To: pgsql-docs(at)postgresql(dot)org
Subject: Possible documentation inaccuracy in optimizer README
Date: 2025-04-08 02:32:01
Message-ID: CACvHWmYFo+60yMqKJajDDvKN5EM41YHrCT3oxukwXmGAqpWvyw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-docs

Hello,

In https://github.com/postgres/postgres/tree/master/src/backend/optimizer,
there are two examples on the dynamic programming (DP) algorithm used in
the optimizer, which I think
have some inaccuracy:

SELECT *
FROM tab1, tab2, tab3, tab4
WHERE tab1.col = tab2.col AND
tab2.col = tab3.col AND
tab3.col = tab4.col

Tables 1, 2, 3, and 4 are joined as:
{1 2},{2 3},{3 4}
{1 2 3},{2 3 4}
{1 2 3 4}
(other possibilities will be excluded for lack of join clauses)

SELECT *
FROM tab1, tab2, tab3, tab4
WHERE tab1.col = tab2.col AND
tab1.col = tab3.col AND
tab1.col = tab4.col

Tables 1, 2, 3, and 4 are joined as:
{1 2},{1 3},{1 4}
{1 2 3},{1 3 4},{1 2 4}
{1 2 3 4}

Both examples seem to ignore the transitivity imposed by the join clauses,
which creates a gap between the documentation here and what the actual
optimizer does.
For the first example, in the actual optimizer, {1,3},{1,4},{2,4} are also
considered. For the second example, {2,3}, {2,4}, {3,4} are also
considered.
The reason is that, for example 1, tab1.col = tab2.col AND tab2.col =
tab3.col imply that tab1.col = tab3.col due to transitivity, which leads to
the consideration of {1,3}.
The rest of the missing entries follow the same reasoning.

best
regards,

Zeyuan

Responses

Browse pgsql-docs by date

  From Date Subject
Next Message Tom Lane 2025-04-08 04:28:07 Re: Possible documentation inaccuracy in optimizer README
Previous Message Bruce Momjian 2025-04-08 01:34:06 Re: Parameter NOT NULL to CREATE DOMAIN not the same as CHECK (VALUE IS NOT NULL)