Re: Improving default column names/aliases of subscript text expressions

From: Jelte Fennema-Nio <postgres(at)jeltef(dot)nl>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Improving default column names/aliases of subscript text expressions
Date: 2024-12-16 22:48:23
Message-ID: CAGECzQSWuavuWu93MzhGyqODbtUqkb=hTysEitR12fYDfTJzxg@mail.gmail.com
Lists: pgsql-hackers

On Mon, 16 Dec 2024 at 21:55, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> Oh, well if you're willing to cheat like that, sure ;-). I was
> speaking of replacing the existing logic with something that looked
> only at the post-analysis tree.

Yeah, alright. That's not really something that I think we can do
without introducing some unintended behavioural changes.

> I dunno, this is so obviously a single-purpose kluge that it's hard
> to call it anything but a kluge. I'm not convinced this is better
> than writing out "SELECT data['a'] AS a, data['b'] AS b, ...".

To be clear, the benefit of not having to add the alias manually grows
when the subscripts are longer than a single character, such as
data['first_name']: there is more extra typing, and a greater chance
of accidental typos in the alias.

As for it being a single-purpose kludge: I don't really see a big
problem with deriving certain specific column names from the
transformed expression and, if that doesn't yield a good name, falling
back to the battle-tested naming based on the untransformed expression.
You even said that people previously wanted to improve certain other
naming using the transformed expression too, so it sounds like it's
not even single-purpose. So instead of seeing this as a kludge, I'd
look at it as the only way forward from the current situation we're
in, without having to worry about unintentionally breaking things.
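
As a rough illustration of that two-step approach, here is a toy model
in C. It is only a sketch and not the actual patch or PostgreSQL's
parser code: the Expr struct and the functions name_from_transformed,
name_from_raw and figure_name are hypothetical names made up for this
example, and the assumption that the fallback yields the base column
name is a simplification of what the existing heuristics do.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, heavily simplified stand-in for a target-list expression. */
typedef struct Expr
{
    const char *base_column;     /* column being subscripted, e.g. "data" */
    const char *const_subscript; /* last subscript if it is a text constant, else NULL */
} Expr;

/*
 * Step 1 (hypothetical): try to derive a name from the transformed
 * expression. Returns NULL when no good name is found.
 */
static const char *
name_from_transformed(const Expr *e)
{
    assert(e != NULL);
    if (e->const_subscript != NULL && e->const_subscript[0] != '\0')
        return e->const_subscript;
    return NULL;
}

/*
 * Step 2 (hypothetical): the existing naming based on the untransformed
 * expression, modelled here as simply returning the base column name.
 */
static const char *
name_from_raw(const Expr *e)
{
    assert(e != NULL);
    return e->base_column;
}

/* Prefer the specific name; otherwise keep the existing behaviour. */
static const char *
figure_name(const Expr *e)
{
    const char *name = name_from_transformed(e);

    return (name != NULL) ? name : name_from_raw(e);
}
```

In this sketch an expression modelled as {"data", "first_name"} gets
the name first_name, while one without a constant text subscript keeps
the name data, so anything the first step does not recognise is named
exactly as before.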

> In particular, it seems like what's going on here is that you
> are using extensible subscripting because that's what's available,
> but what you really wish you had is extensible field selection.
> If you could write "SELECT (data).a, (data).b, ..." then the
> existing FigureColname heuristics would do what you want already.

There's a lot of extensibility that I would like to have in the
parser. But yes, extensible field selection would be very nice for the
things I'm working on. Those parentheses don't look very
user-friendly, though I guess [1] would probably resolve that.

> I know we kicked that idea around a little in the past, but
> nobody has looked into it seriously.

Yeah, that's definitely an area I plan to look more seriously into
soon-ish. But that's a much bigger project, and this seemed like a
fairly easy win, both for what I'm working on [2] and for the built-in
json indexing.

[1]: https://www.postgresql.org/message-id/8bb3af8a-796c-440f-b775-d05437b75e6f@eisentraut.org
[2]: https://github.com/duckdb/pg_duckdb
