Allow subfield references without parentheses

From: Peter Eisentraut <peter(at)eisentraut(dot)org>
To: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Allow subfield references without parentheses
Date: 2024-12-12 12:23:56
Message-ID: 8bb3af8a-796c-440f-b775-d05437b75e6f@eisentraut.org
Lists: pgsql-hackers

This patch allows subfield references in column references without
parentheses, subject to certain conditions. This implements (hopes to,
anyway) the rules from the SQL standard (since SQL99).

This has been requested a number of times over the years. [0] is a
recent discussion that mentioned it.

Specifically, identifier chains of three or more items now have an
additional possible interpretation.

Before:

A.B.C: schema A, table B, column or function C
A.B.C.D: database A, schema B, table C, column or function D

Now additionally:

A.B.C: correlation A, column B, field C; like (A.B).C
A.B.C.D: correlation A, column B, field C, field D; like (A.B).C.D

Also, identifier chains longer than four items now have an analogous
interpretation. They had no possible interpretation before.

(Note that single identifiers and two-part identifiers are not affected
at all.)
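
The list of readings above can be restated as a small toy model. This is
only a sketch of the rules as described in this message, not code from the
patch; the function name readings() and the label strings are made up for
illustration. An empty result means the chain is not affected and is
handled by the existing rules.

```python
def readings(chain):
    """Return the candidate readings of a dotted identifier chain.

    Toy model only: each reading is a list of labels, one per identifier.
    Chains of one or two identifiers are unaffected, so they yield [].
    """
    n = len(chain)
    out = []
    # Pre-existing readings exist only for three- and four-part chains.
    if n == 3:
        out.append(["schema", "table", "column or function"])
    elif n == 4:
        out.append(["database", "schema", "table", "column or function"])
    # Additional reading: correlation, column, then one or more fields.
    if n >= 3:
        out.append(["correlation", "column"] + ["field"] * (n - 2))
    return out


for text in ("a.b", "a.b.c", "a.b.c.d", "a.b.c.d.e"):
    print(text, "->", readings(text.split(".")))
```

Note that a five-part chain gets exactly one reading here, which matches
the statement that such chains had no possible interpretation before.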

The "correlation A" above must be an explicit alias, not just a table name.

If both possible interpretations apply, then an error is raised. (A
workaround is to change the alias used in the query.) Such errors
should be very rare in practice.
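
To make the conflict rule concrete, here is a toy resolver for three-part
chains. It is a sketch under stated assumptions, not the logic in
transformColumnRef(): the names resolve(), AmbiguousReference,
qualified_columns and alias_fields are hypothetical, and the "catalog" is
just a set and a dict. Only explicit aliases appear in alias_fields, which
models the requirement that the correlation be an explicit alias.

```python
class AmbiguousReference(Exception):
    """Toy stand-in for the error raised when both readings apply."""


def resolve(chain, qualified_columns, alias_fields):
    """Pick the reading of a three-part chain, or fail if both apply.

    qualified_columns: set of (schema, table, column) tuples in scope.
    alias_fields: {explicit_alias: {column: set of field names}}.
    Returns a label for the matching reading, or None if nothing matches.
    """
    if len(chain) != 3:
        raise ValueError("this sketch only handles three-part chains")
    a, b, c = chain
    matches = []
    # Pre-existing reading: schema a, table b, column c.
    if (a, b, c) in qualified_columns:
        matches.append("schema.table.column")
    # Additional reading: alias a, composite column b, field c.
    if c in alias_fields.get(a, {}).get(b, set()):
        matches.append("(correlation.column).field")
    if len(matches) > 1:
        raise AmbiguousReference(".".join(chain))
    return matches[0] if matches else None


columns = {("s", "t", "c")}
# Alias "x" does not collide with schema "s", so only one reading applies.
print(resolve(["x", "t", "c"], columns, {"x": {"t": {"c"}}}))
# With alias "s" both readings apply and the toy error is raised.
try:
    resolve(["s", "t", "c"], columns, {"s": {"t": {"c"}}})
except AmbiguousReference as exc:
    print("ambiguous:", exc)
```

The second call shows the workaround mentioned above in reverse: the
conflict exists only because the alias happens to match a schema name, and
renaming the alias (as in the first call) makes it go away.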

In [0] there was some light discussion about other possible behaviors in
case of conflicts. In any case, with this patch it's possible to
experiment with different behaviors, just by replacing the conditional
that raises the error with another action. I also studied ruleutils.c a
bit to see if there are any tweaks needed to support this. So far it
seems okay. I'm sure we can come up with some pathological cases, but
so far I haven't done anything about them.

I left a couple of TODO notes in the patch such as where documentation
should be updated, and I didn't do anything about SQL and PL/pgSQL
parameters so far. Also, I tried to weave the additional code into
transformColumnRef() in a way that doesn't move much existing code
around, but eventually this should probably be reorganized a bit to
reduce duplication.

Another thing to think about would be the exact phrasing of any error
messages. Right now, transformColumnRef() assumes that a given
identifier chain can only have one possible interpretation, and if it
doesn't find the thing, the error says "didn't find the thing". But now
that there can be multiple possible interpretations, it should probably
say something more like "didn't find this and also not that" or "didn't
find anything that matches that" or some other variant. I mean, what it
does now isn't bad, but given the amount of attention we have put into
the fine-tuning of these specific errors in the past, some additional
changes might be desired.

[0]:
https://www.postgresql.org/message-id/flat/CAFiTN-uiwaogH-dbz-ARpUUQM%2BRQKdU2qmPh1WzM6gEyS8PVRA%40mail.gmail.com

Attachment: v0-0001-Allow-subfield-references-without-parentheses.patch (text/plain, 13.5 KB)
