Re: Improving default column names/aliases of subscript text expressions

From: Jelte Fennema-Nio <postgres(at)jeltef(dot)nl>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Improving default column names/aliases of subscript text expressions
Date: 2024-12-16 17:33:29
Message-ID: CAGECzQQEMNSFGmb_b-E7+=J_KuZUNgEdr2cMn=fsJ1yib8HekQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Mon, 16 Dec 2024 at 16:34, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> I dunno, this seems to be putting an undue amount of emphasis on
> one very specific usage pattern. Why should it matter whether
> the subscripts are string literals or not? What will happen when
> ruleutils decides to dump those expressions with the implicit
> cast to text being less implicit?

Thanks for the quick response. To be clear, the attached patch was
just a quick POC, it definitely needs more work if it's the direction
I should pursue. Indeed adding support for explicit casts seems like a
nice thing.

> regression=# create view v1 as
> SELECT data['a'] FROM tj;
> CREATE VIEW
> regression=# \d+ v1
> View "public.v1"
> Column | Type | Collation | Nullable | Default | Storage | Description
> --------+-------+-----------+----------+---------+----------+-------------
> data | jsonb | | | | extended |
> View definition:
> SELECT data['a'::text] AS data
> FROM tj;

Are you sure you ran this with my patch? When I do this it actually
fills in the "better" alias just fine, which makes sense because when
the view query is parsed it doesn't have the cast yet. The cast only
gets added by ruleutils, which also adds the explicit alias.

> create view v1 as SELECT data['a'] FROM tj;
CREATE VIEW
> \d+ v1;
View "public.v1"
Column │ Type │ Collation │ Nullable │ Default │ Storage │ Description
────────┼───────┼───────────┼──────────┼─────────┼──────────┼─────────────
a │ jsonb │ │ │ │ extended │
View definition:
SELECT data['a'::text] AS a
FROM tj;

> I'm also wondering how such a rule interacts with string literals
> that don't get resolved as text:
>
> regression=# create table t2 (data int[]);
> CREATE TABLE
> regression=# insert into t2 values(array[1,2,3,4]);
> INSERT 0 1
> regression=# select data['2'], data[3] from t2;
> data | data
> ------+------
> 2 | 3
> (1 row)

So what would you want here? Do you want these columns to be called 2
and 3? That shouldn't be too difficult to implement, but that seemed
like it would possibly go a bit too far.

> This would be a GUC that changes query semantics, which is a
> thing we've learned the hard way is evil. (It would not be
> any less evil in one of the other implementation approaches.)

Hmm, okay. I didn't know that.

> I think that to do something like this, we'd have to make a
> considered decision that the new way is so much better than
> the old that it's okay to break some people's queries.
> I don't say that bar is unreachable, but it seems high.

Totally fair, that's part of the reason I limited this to only string
literals, to effectively only impact JSON (which only had subscript
support since PG14) and not arrays.

One thing that I didn't see you explicitly say: Do you agree that the
new column names are actually better than the old ones?

> One way to lower the bar would be to make the change affect
> fewer cases, like maybe only JSON subscripts. That points
> towards your idea of making the subscript implementation
> responsible. However, if memory serves we pick the column
> aliases before doing parse analysis, which'd make it hard
> to know which data type is involved. I do not really recall
> why it's done that way or whether it'd be practical to change.

Oh, that does sound annoying indeed. I'll take a look at how hard that
would be to change.

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Jeff Davis 2024-12-16 17:49:22 Re: Add CASEFOLD() function.
Previous Message Peter Geoghegan 2024-12-16 17:31:47 Re: Maybe we should reduce SKIP_PAGES_THRESHOLD a bit?