Re: proposal: schema variables

From: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
To: Laurenz Albe <laurenz(dot)albe(at)cybertec(dot)at>
Cc: Erik Rijkers <er(at)xs4all(dot)nl>, Michael Paquier <michael(at)paquier(dot)xyz>, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, DUVAL REMI <REMI(dot)DUVAL(at)cheops(dot)fr>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: proposal: schema variables
Date: 2024-08-15 05:55:15
Message-ID: CAFj8pRDF-F5diNTXr206QfQFauFNK1vu_7NARXd0dtdwGTWLMQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers pgsql-performance

út 30. 7. 2024 v 21:46 odesílatel Laurenz Albe <laurenz(dot)albe(at)cybertec(dot)at>
napsal:

> This is my review of the second patch in the series.
>
> Again, I mostly changed code comments and documentation.
>
> Noteworthy remarks:
>
> - A general reminder: single line comments should start with a lower case
> letter and have to period in the end:
>

Should it be "have not to period in the end" ?

>
> /* this is a typical single line comment */
>

I fixed almost all without parts related to psql and executor - there
almost all current comments break the mentioned rule. So I use the style
used in these files. I can fix it (if you like it) - or it can be fixed
generally in a separated patch set.

> - Variable as parameter:
>
> CREATE VARIABLE var AS date;
> LET var = current_date;
> PREPARE stmt(date) AS SELECT $1;
> EXECUTE stmt(var);
> ERROR: paramid of PARAM_VARIABLE param is out of range
>
> Is that working as intended? I don't understand the error message.
>

fixed in 0002 patch (variables cannot be used as EXECUTE argument) and 0014
(enable usage variables as an argument of EXECUTE)

>
> Perhaps there is a bug in src/backend/executor/execExpr.c.
>
> - IdentifyVariable
>
> > --- a/src/backend/catalog/namespace.c
> > +++ b/src/backend/catalog/namespace.c
> > [...]
> > +/*
> > + * IdentifyVariable - try to find variable identified by list of
> names.
> > [...]
>
> The function comment doesn't make clear to me what the function does.
> Perhaps that is so confusing because the function seems to serve several
> purposes (during parsing, in ANALYZE, and to identify shadowed variables
> in a later patch).
>
> I rewrote the function comment, but didn't touch the code or code
> comments.
>

merged

>
> Perhaps part of the reason why this function is so complicated is that
> you support things like this:
>
> CREATE SCHEMA sch;
> CREATE TYPE cp AS (a integer, b integer);
> CREATE VARIABLE sch.v AS cp;
> LET sch.v = (1, 2);
> SELECT sch.v.b;
> b
> ═══
> 2
> (1 row)
>
> test=# SELECT test.sch.v.b;
> b
> ═══
> 2
> (1 row)
>
> We don't support that for tables:
>
> CREATE TABLE tab (c cp);
> INSERT INTO tab VALUES (ROW(42, 43));
> SELECT tab.c.b FROM tab;
> ERROR: cross-database references are not implemented: tab.c.b
>
> You have to use extra parentheses:
>
> SELECT (tab.c).b FROM tab;
> b
> ════
> 43
> (1 row)
>
> I'd say that this should be the same for session variables.
> What do you think?
>

I prefer the current state, but I don't have a very strong opinion about
it. It can save 115 lines of almost trivial code, but I think the user
experience can be much worse. Using composite types in tables is not too
common a pattern (and when I read some recommendations for Oracle, it is an
antipattern), but usage of composite variables is common (it can hold a
row). Moreover, we talked long about possible identifier's collisions, and
the pattern schema.variable is very good protection against possible
collisions. Probably the pattern catalog.schema.variable.field is not too
interesting but the support has 50 lines.

More, the plpgsql supports pattern label.variable.field, so can be little
bit unfriendly if session variables doesn't support "similar" pattern

create type t as (x int, y int);
do $$
<<lab1>>declare v t;
begin
v := (20,20); raise notice '%', lab1.v.x;
end $$;
NOTICE: 20
DO

> Code comments:
>
> - There are lots of variables declared at the beginning, but they are
> only
> used in code blocks further down. For clarity, you should declare a
> variable in the code block where it is used.
>

moved

>
> - + /*
> + * In this case, "a" is used as catalog name -
> check it.
> + */
> + if (strcmp(a, get_database_name(MyDatabaseId)) !=
> 0)
> + {
> + if (!noerror)
> + ereport(ERROR,
> +
> (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
> + errmsg("cross-database references
> are not implemented: %s",
> + NameListToString(names))));
> + }
> + else
> + {
>
> There needn't be an "else", since the ereport() will never return.
> Less indentation is better.
>
>
done

> - src/backend/commands/session_variable.c
>
> There is one comment that confuses me and that I did not edit:
>
> > +Datum
> > +GetSessionVariable(Oid varid, bool *isNull, Oid *typid)
> > +{
> > + SVariable svar;
> > +
> > + svar = get_session_variable(varid);
> > +
> > + /*
> > + * Although svar is freshly validated in this point, the
> svar->is_valid can
> > + * be false, due possible accepting invalidation message inside
> domain
> > + * check. Now, the validation is done after lock, that can also
> accept
> > + * invalidation message, so validation should be trustful.
> > + *
> > + * For now, we don't need to repeat validation. Only svar should
> be valid
> > + * pointer.
> > + */
>

This comment is related to assertions. Before I had there
`Assert(svar->is_valid)`, because I expected it. But it was not always
true. And although it is true, we don't need to validate a variable,
because at this moment, the variable should be locked, and then we can
return content safely.

> > + Assert(svar);
> > +
> > + *typid = svar->typid;
> > +
> > + return copy_session_variable_value(svar, isNull);
> > +}
>
>

> - src/backend/executor/execMain.c
>
> > @@ -196,6 +198,51 @@ standard_ExecutorStart(QueryDesc *queryDesc, int
> eflags)
> > Assert(queryDesc->sourceText != NULL);
> > estate->es_sourceText = queryDesc->sourceText;
> >
> > + /*
> > + * The executor doesn't work with session variables directly.
> Values of
> > + * related session variables are copied to dedicated array, and
> this array
> > + * is passed to executor.
> > + */
>
> Why? Is that a performance measure? The comment should explain that.
>

Session variables are volatile internally - some function that uses LET
statements can be called more times inside a query. This behavior is not
consistent with plpgsql's variables or external parameters. And if we want
to support parallel queries, then volatile session variables can be pretty
messy (and unusable). If we want the same behaviour for queries with or
without parallel support, then the session variables should be "stable"
(like stable functions). Simple implementation is using "snapshot" of
values of used session variables when query is started. This "snapshot" is
immutable, so we don't need to make a copy more times.

I changed this comment

<-->/*
<--> * The executor doesn't work with session variables directly. Values of
<--> * related session variables are copied to a dedicated array, and this
array
<--> * is passed to the executor. This array is stable "snapshot" of values
of used
<--> * session variables. There are three benefits of this strategy:
<--> *
<--> * - consistency with external parameters and plpgsql variables,
<--> *
<--> * - session variables can be parallel safe,
<--> *
<--> * - we don't need make fresh copy for any read of session variable
<--> * (this is necessary because the internally the session variable can
<--> * be changed inside query execution time, and then a reference to
<--> * previously returned value can be corrupted).
<--> */

> - parallel safety
>
> > --- a/src/backend/optimizer/util/clauses.c
> > +++ b/src/backend/optimizer/util/clauses.c
> > @@ -935,6 +936,13 @@ max_parallel_hazard_walker(Node *node,
> max_parallel_hazard_context *context)
> > if (param->paramkind == PARAM_EXTERN)
> > return false;
> >
> > + /* We doesn't support passing session variables to workers */
> > + if (param->paramkind == PARAM_VARIABLE)
> > + {
> > + if (max_parallel_hazard_test(PROPARALLEL_RESTRICTED,
> context))
> > + return true;
> > + }
>
> Even if a later patch alleviates that restriction, this patch should
> document that
> session variables imply "parallel restricted". I have added that to
> doc/src/sgml/parallel.sgml.
>

merged (and removed in 0015)

>
> Attached are the first two patches with my edits (the first patch is
> unchanged;
> I just add it again to keep the cfbot happy.
>
> I hope to get to review the other patches at some later time.
>
> Yours,
> Laurenz Albe
>

Attached new patchset

Regards

Pavel

Attachment Content-Type Size
v20240815-0002-Storage-for-session-variables-and-SQL-interface.patch text/x-patch 147.1 KB
v20240815-0003-function-pg_session_variables-for-cleaning-tests.patch text/x-patch 4.6 KB
v20240815-0004-DISCARD-VARIABLES.patch text/x-patch 9.6 KB
v20240815-0005-memory-cleaning-after-DROP-VARIABLE.patch text/x-patch 22.4 KB
v20240815-0001-Enhancing-catalog-for-support-session-variables-and-.patch text/x-patch 132.2 KB
v20240815-0006-plpgsql-tests.patch text/x-patch 16.9 KB
v20240815-0007-GUC-session_variables_ambiguity_warning.patch text/x-patch 13.9 KB
v20240815-0009-PREPARE-LET-support.patch text/x-patch 7.4 KB
v20240815-0008-EXPLAIN-LET-support.patch text/x-patch 8.3 KB
v20240815-0010-implementation-of-temporary-session-variables.patch text/x-patch 39.3 KB
v20240815-0011-Implementation-ON-TRANSACTION-END-RESET-clause.patch text/x-patch 14.6 KB
v20240815-0012-Implementation-of-DEFAULT-clause-default-expressions.patch text/x-patch 33.6 KB
v20240815-0013-Implementation-of-NOT-NULL-and-IMMUTABLE-clauses.patch text/x-patch 35.6 KB
v20240815-0014-allow-read-an-value-of-session-variable-directly-fro.patch text/x-patch 13.3 KB
v20240815-0015-allow-parallel-execution-queries-with-session-variab.patch text/x-patch 11.9 KB
v20240815-0016-plpgsql-implementation-for-LET-statement.patch text/x-patch 14.2 KB
v20240815-0017-expression-with-session-variables-can-be-inlined.patch text/x-patch 4.2 KB
v20240815-0019-transactional-variables.patch text/x-patch 39.2 KB
v20240815-0018-this-patch-changes-error-message-column-doesn-t-exis.patch text/x-patch 29.1 KB
v20240815-0020-pg_restore-A-variable.patch text/x-patch 2.8 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Peter Smith 2024-08-15 06:27:01 Re: Logical Replication of sequences
Previous Message Peter Eisentraut 2024-08-15 05:14:26 Re: Variable renaming in dbcommands.c

Browse pgsql-performance by date

  From Date Subject
Next Message Stephen Samuel (Sam) 2024-08-19 01:55:50 Trying to understand why a query is filtering when there is a composite index
Previous Message Pavel Stehule 2024-08-08 05:17:00 Re: proposal: schema variables