Re: Improving default column names/aliases of subscript text expressions

From: Jelte Fennema-Nio <postgres(at)jeltef(dot)nl>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Improving default column names/aliases of subscript text expressions
Date: 2024-12-16 19:05:39
Message-ID: CAGECzQRL6+0F5ej3f=LgDBzEsqr2gbaNvV1qmVEJXhDgrFUR8w@mail.gmail.com
Lists: pgsql-hackers

On Mon, 16 Dec 2024 at 19:32, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> No, sorry, I was just illustrating the behavior with HEAD.
> The important part of this is not the assigned alias
> but the visible cast.

Then I don't think I understand what you're trying to say. While I
think it would be good to not have an explicit cast impact the
explicit casts in a subscript as well, it does seem like a rather
niche edge case for people to write a query where that would matter.
If you're doing such explicit casts, you probably also want to set an
explicit alias. A case in point is ruleutils, which makes both the
implicit cast and the implicit column alias explicit.

> > So what would you want here? Do you want these columns to be called 2
> > and 3?
>
> No!!

Good, then we agree on that at least.

> > One thing that I didn't see you explicitly say: Do you agree that the
> > new column names are actually better than the old ones?
>
> No, I'm not at all convinced of that. For these examples
> I'd prefer something like "data_a", "data_b", etc.

I did consider that naming scheme as well, but decided against it for a few reasons:
1. That same naming scheme holds just as well for fields of composite
types. It seems inconsistent to only do it for subscripts. Our logic
now is to take the last field name in a series of field names. My POC
patch basically extends that to be the last field name OR
string-literal subscript. (To be clear, changing the default column
names for fields of composite types like this seems out of the
question to me, given how much existing behavior it would affect.)
2. There's a hard limit of 63 characters in a column name due to
NAMEDATALEN, so putting the whole path in there won't fit in case of
somewhat long subscript names.
3. Even if we'd have unlimited length, now you end up with common
prefixes if you select multiple fields.
4. For the custom type that I'm implementing the subscripting for, I
really don't want such a prefix.
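
To make the trade-off in points 1-3 concrete, here is a rough Python
sketch of the two naming schemes. It is only an illustration, not the
POC patch or the parser's actual code: the function names, the
representation of a column reference as a base name plus a list of
steps (strings for field names or string-literal subscripts, integers
for integer subscripts), and the decision to skip integer subscripts
are all assumptions made for this example. The only PostgreSQL fact it
relies on is that identifiers are clipped to NAMEDATALEN - 1 bytes
(63 with the default build).

```python
# Hypothetical model of the two default-alias schemes discussed above.
NAMEDATALEN = 64  # default compile-time value; names hold NAMEDATALEN - 1 bytes


def truncate_identifier(name: str) -> str:
    """Clip a name to NAMEDATALEN - 1 bytes without splitting a UTF-8 character."""
    raw = name.encode("utf-8")[: NAMEDATALEN - 1]
    # A partially cut multibyte character at the end is dropped.
    return raw.decode("utf-8", errors="ignore")


def last_name_alias(base: str, path: list) -> str:
    """'Last field name OR string-literal subscript' scheme (assumed model)."""
    for step in reversed(path):
        if isinstance(step, str):  # field name or string-literal subscript
            return truncate_identifier(step)
    # Only integer subscripts (or none): fall back to the base column name.
    return truncate_identifier(base)


def full_path_alias(base: str, path: list) -> str:
    """'data_a' style scheme: join the whole path with underscores."""
    return truncate_identifier("_".join([base, *map(str, path)]))


if __name__ == "__main__":
    print(last_name_alias("data", ["a"]))   # a
    print(full_path_alias("data", ["a"]))   # data_a
    print(full_path_alias("data", [2]))     # data_2
    long_a = "x" * 70 + "a"
    long_b = "x" * 70 + "b"
    # Both long paths are clipped to the same 63-byte name.
    print(full_path_alias("data", [long_a]) == full_path_alias("data", [long_b]))
```

In this model the full-path scheme gives every selected field the same
"data_" prefix, and two sufficiently long subscripts collapse to the
same clipped name, while the last-name scheme spends the 63 bytes on
the subscript itself.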

> That approach might also make it more palatable to process integer
> literals this way (i.e. "data_2" etc), though I am not sure we want
> to do that because of the increased blast radius.

I agree, that would be nice, but I don't think that's worth the
additional blast radius.
