Re: Schema variables - new implementation for Postgres 15

From: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
To: Dmitry Dolgov <9erthalion6(at)gmail(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Peter Eisentraut <peter(at)eisentraut(dot)org>, Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>, Peter Eisentraut <peter(dot)eisentraut(at)enterprisedb(dot)com>, Sergey Shinderuk <s(dot)shinderuk(at)postgrespro(dot)ru>, Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com>, Julien Rouhaud <rjuju123(at)gmail(dot)com>, dean(dot)a(dot)rasheed(at)gmail(dot)com, er(at)xs4all(dot)nl, joel(at)compiler(dot)org, pgsql-hackers(at)lists(dot)postgresql(dot)org, Wolfgang Walther <walther(at)technowledgy(dot)de>
Subject: Re: Schema variables - new implementation for Postgres 15
Date: 2024-05-31 10:54:03
Message-ID: CAFj8pRCnMGCqhZ71FkuRk1T-PMUoorMZAk+2vnoNPA5BwKOOXw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

pá 31. 5. 2024 v 11:46 odesílatel Dmitry Dolgov <9erthalion6(at)gmail(dot)com>
napsal:

> > On Tue, May 28, 2024 at 05:18:02PM GMT, Pavel Stehule wrote:
> >
> > I propose another variants. First we can introduce pseudo function VAR(
> ).
> > The argument should be session variables. The name of this function can
> be
> > pgvar, globvar, ... We can talk about good name, it should not be too
> long,
> > but it is not important now. The VAR() function will be pseudo function
> > like COALESCE, so we can easily to set correct result type.
>
> So, the purpose of the function would be only to verify that the argument
> is a
> session variable? That seems to be a very light payload, which looks a bit
> awkward.
>

no, it just reduces catalog searching to variables. So with using this
function, then there is no possibility of collision between variables and
other objects. The argument can be only variable and nothing else. So then
the conflict is not possible. When somebody tries to specify a table or
column, then it fails, because this object will not be detected. So inside
this function, the tables and columns cannot to shading variables, and
variables cannot be replaced by columns.

So the proposed function is not just assert, it is designed like a catalog
filter.

> Out of those options you propose I think the first one is the
> most straightforward one, but...
>
> > Alvaro Herrera:
> > > Perhaps the solution to all this is to avoid having the variables be
> > > implicitly present in the range table of all queries. Instead, if you
> > > need a variable's value, then you need to add the variable to the FROM
> > > clause;
>
> The more I think about this, the more I like this solution. Marking
> which variables are available to the query this way, and using established
> patterns for resolving ambiguity actually looks intuitive to me. Now I
> know,
> you've got strong objections:
>

I still don't like this - mainly from two reasons

1. it doesn't look user friendly - you need to maintain two different
places in one query for one object. I can imagine usage there in the case
of composite variables with unpacking (and then it can be consistent with
others). I can imagine to use optional usage of variables there for the
possibility of realiasing - like functions - and if we should support it,
then with unpacking of composite values.

(2024-05-31 12:33:57) postgres=# create type t as (a int, b int);
CREATE TYPE
(2024-05-31 12:35:26) postgres=# create function fx() returns t as $$
select 1, 2 $$ language sql;
CREATE FUNCTION
(2024-05-31 12:35:44) postgres=# select fx();
┌───────┐
│ fx │
╞═══════╡
│ (1,2) │
└───────┘
(1 row)

(2024-05-31 12:35:47) postgres=# select * from fx();
┌───┬───┐
│ a │ b │
╞═══╪═══╡
│ 1 │ 2 │
└───┴───┘
(1 row)

2. But my main argument is, it is not really safe - it solves Peter's use
case, but if I use a reverse example of Peter's case, I still have a
problem.

I can have a variable x, and then I can write query like `SELECT x FROM x`;

but if somebody creates table x(x int), then the query `SELECT x FROM x`
will be correct, but it is surely something else. So the requirement of the
usage variable inside FROM clause doesn't help. It doesn't work.

> > I don't like this. Sure, this fixes the problem with collisions, but then
> > we cannot talk about variables. When some is used like a table, then it
> > should be a table. I can imagine memory tables, but it is a different
> type
> > of object. Table is relation, variable is just value. Variables should
> not
> > have columns, so using the same patterns for tables and variables has no
> > sense. Using the same catalog for variables and tables. Variables just
> hold
> > a value, and then you can use it inside a query without necessity to
> write
> > JOIN. Variables are not tables, and then it is not too confusing so they
> > are not transactional and don't support more rows, more columns.
>
> A FROM clause could contain a function returning a single value, nobody
> finds it confusing. And at least to me it's not much different from having
> a
> session variable as well, what do you think?
>

but there is a difference when function returns composite, and when not -
if I use function in FROM clause, I'll get unpacked columns, when I use
function in columns, then I get composite.

The usage variable in FROM clause can have sense in similar princip like
functions - for possibility to use alias in same level of query and
possibility to use one common syntax for composite unpacking. But it
doesn't help with safety against collisions.

>
> > c) using variables with necessity to define it in FROM clause. It is
> safe,
> > but it can be less readable, when you use more variables, and it is not
> too
> > readable, and user friendly, because you need to write FROM. And can be
> > messy, because you usually will use variables in queries, and it is
> > introduce not relations into FROM clause. But I can imagine this mode as
> > alternative syntax, but it is very unfriendly and not intuitive (I
> think).
>
> The proposal from Wolfgang to have a short-cut and not add FROM in
> case there
> is no danger of ambiguity seems to resolve that.
>
> > More probably it doesn't fast execution in simple expression execution
> mode.
>
> Could you elaborate more, what do you mean by that? If the performance
> overhead is not prohibitive (which I would expect is the case), having
> better
> UX for a new feature usually beats having better performance.
>

PLpgSQL has a special mode for faster expression execution. One
prerequisite is not using FROM clause.

> > It looks odd - It is not intuitive, it introduces new inconsistency
> inside
> > Postgres, or with solutions in other databases. No other database has a
> > similar rule, so users coming from Oracle, Db2, or MSSQL, Firebird will
> be
> > confused. Users that use PL/pgSQL will be confused.
>
> Session variables are not part of the SQL standard, and maintaining
> consistency with other databases is a questionable goal. Since it's a new
> feature, I'm not sure what you mean by inconsistency inside Postgres
> itself.
>
> I see that the main driving case behind this patch is to help with
> migrating from other databases that do have session variables. Going with
> variables in FROM clause, will not make a migration much harder -- some of
> the
> queries would have to modify the FROM part, and that's it, right? I could
> imagine it would be even easier than adding VAR() everywhere.
>

I don't think - VAR(x) instead x is just a simple replacement - searching
related FROM clauses is much more complex work.

and if we talk about safety against collisions, then FROM clause doesn't
help. Moreover, this safety is not guaranteed today because we have a
search patch and we support unqualified identifiers.

Regards

Pavel

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Wolfgang Walther 2024-05-31 11:10:43 Re: Schema variables - new implementation for Postgres 15
Previous Message Aleksander Alekseev 2024-05-31 10:34:17 Re: meson "experimental"?