From: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
---|---|
To: | Andres Freund <andres(at)anarazel(dot)de> |
Cc: | Vladimir Churyukin <vladimir(at)churyukin(dot)com>, pgsql-hackers(at)postgresql(dot)org |
Subject: | Re: Improving inferred query column names |
Date: | 2023-02-23 04:03:48 |
Message-ID: | 341525.1677125028@sss.pgh.pa.us |
Lists: | pgsql-hackers |
Andres Freund <andres(at)anarazel(dot)de> writes:
> On 2023-02-22 16:38:51 -0500, Tom Lane wrote:
>> The proposal so far was just to handle a function call wrapped
>> around something else by converting to the function name followed
>> by whatever we'd emit for the something else.
> SELECT sum(relpages), sum(reltuples), 1+1 FROM pg_class;
> ┌──────────────┬───────────────┬──────────┐
> │ sum_relpages │ sum_reltuples │ ?column? │
> ├──────────────┼───────────────┼──────────┤
So far so good, but what about multi-argument functions?
Do we do "f_x_y_z", and truncate wherever? How well will this
work with nested function calls?
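To make the multi-argument and nesting questions concrete, here is a minimal Python sketch of the "function name followed by whatever we'd emit for the argument" rule. It is not PostgreSQL code: the tuple-based expression model, the helper names `inferred_name` and `truncate`, and the fallback to the bare function name when an argument has no usable name are all assumptions made for illustration. The only PostgreSQL fact used is that identifiers hold at most NAMEDATALEN - 1 bytes (63 with the default build).

```python
# Hypothetical sketch, not PostgreSQL source. Expressions are modeled as
# ("col", name), ("func", name, [args]), or anything else (unnameable).
NAMEDATALEN = 64  # default build value; names hold NAMEDATALEN - 1 bytes


def inferred_name(node):
    """Return a generated name for node, or None if nothing sensible exists."""
    kind = node[0]
    if kind == "col":
        return node[1]
    if kind == "func":
        _, fname, args = node
        arg_names = [inferred_name(a) for a in args]
        # Assumption: if any argument is unnameable, keep only the function name.
        if any(n is None for n in arg_names):
            return fname
        return "_".join([fname] + arg_names)
    return None


def truncate(name, limit=NAMEDATALEN - 1):
    """Clip to limit bytes without leaving a partial multibyte character."""
    return name.encode("utf-8")[:limit].decode("utf-8", errors="ignore")


single = inferred_name(("func", "sum", [("col", "relpages")]))
multi = inferred_name(("func", "f", [("col", "x"), ("col", "y"), ("col", "z")]))

# Two different nestings, f(g(x), y) and f(g(x, y)), flatten to the same name.
nested_a = inferred_name(("func", "f", [("func", "g", [("col", "x")]), ("col", "y")]))
nested_b = inferred_name(("func", "f", [("func", "g", [("col", "x"), ("col", "y")])]))

print(single, multi, nested_a, nested_b)
```

Under these assumptions the nesting structure is lost in the flattened name, and a blind truncation at 63 bytes cuts off whichever arguments happen to come last.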
>> You cannot realistically
>> handle, say, operator expressions without emitting names that will
>> require quoting, which doesn't seem attractive.
> Well, it doesn't require much to be better than "?column?", which already
> requires quoting...
I think the point of "?column?" is to use something that nobody's going
to want to reference that way, quoted or otherwise. The SQL spec says
(in SQL:2021, it's 7.16 <query specification> syntax rule 18) that if the
column expression is anything more complex than a simple column reference
(or SQL parameter reference, which I think we don't support) then the
column name is implementation-dependent, which is standards-ese for
"here be dragons".
BTW, SQL92 and SQL99 had a further constraint:
c) Otherwise, the <column name> of the i-th column of the <query
specification> is implementation-dependent and different
from the <column name> of any column, other than itself, of
a table referenced by any <table reference> contained in the
SQL-statement.
We never tried to implement that literally, and now I'm glad we didn't
bother, because recent spec versions only say "implementation-dependent",
full stop. In any case, the spec is clearly in the camp of "don't depend
on these column names".
> We could just do something like printing <left>_<funcname>_<right>. So
> something like avg(reltuples / relpages) would end up as
> avg_reltuples_float48div_relpages.
> Whether that's worth it, or whether column name lengths would be too painful,
> IDK.
I think you'd soon be hitting NAMEDATALEN limits ...
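A rough Python sketch of how quickly the `<left>_<funcname>_<right>` scheme grows. The helper `op_name` is hypothetical, and the operator function names are used purely as illustrative strings (`float48div` is taken from the example quoted above, `float8mul` is assumed for the multiplication). NAMEDATALEN is 64 in a default build, so a name can hold 63 bytes.

```python
# Hypothetical sketch: measures generated-name length, nothing more.
NAMEDATALEN = 64  # default build value; names hold NAMEDATALEN - 1 bytes


def op_name(left, funcname, right):
    """Build a name the way the quoted proposal describes."""
    return f"{left}_{funcname}_{right}"


ratio = op_name("reltuples", "float48div", "relpages")

# avg(reltuples / relpages)
single = "avg_" + ratio

# avg((reltuples / relpages) * (reltuples / relpages))
squared = "avg_" + op_name(ratio, "float8mul", ratio)

for name in (single, squared):
    fits = len(name.encode("utf-8")) <= NAMEDATALEN - 1
    print(len(name), fits, name)
```

With one operator the name is 33 bytes; with two levels of operators it is already 73 bytes and would have to be truncated.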
>> And no, deduplication isn't on the table at all here.
> +1
I remembered while looking at the spec that duplicate column names
in SELECT output are not only allowed but *required* by the spec.
If you write, say, "SELECT 1 AS x, 2 AS x, ..." then the column
names of those two columns are both "x", no wiggle room at all.
So I see little point in trying to deduplicate generated names,
even aside from the points you made.
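For readers who want to see duplicate output names from the client side, a small Python sketch follows. It uses the standard-library `sqlite3` module with an in-memory SQLite database rather than PostgreSQL, purely so that it runs without a server; that substitution is an assumption of the example, not something from this thread.

```python
# Sketch using SQLite (not PostgreSQL) so the example is self-contained.
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.execute("SELECT 1 AS x, 2 AS x")

# cursor.description holds one 7-tuple per output column; item 0 is the name.
column_names = [d[0] for d in cur.description]
row = cur.fetchone()
conn.close()

print(column_names, row)
```

Both output columns carry the name "x", so a client has to tell them apart by position rather than by name.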
regards, tom lane