From: | Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com> |
---|---|
To: | Laurenz Albe <laurenz(dot)albe(at)cybertec(dot)at> |
Cc: | Erik Rijkers <er(at)xs4all(dot)nl>, Michael Paquier <michael(at)paquier(dot)xyz>, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, DUVAL REMI <REMI(dot)DUVAL(at)cheops(dot)fr>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org> |
Subject: | Re: proposal: schema variables |
Date: | 2024-08-15 05:55:15 |
Message-ID: | CAFj8pRDF-F5diNTXr206QfQFauFNK1vu_7NARXd0dtdwGTWLMQ@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers pgsql-performance |
út 30. 7. 2024 v 21:46 odesílatel Laurenz Albe <laurenz(dot)albe(at)cybertec(dot)at>
napsal:
> This is my review of the second patch in the series.
>
> Again, I mostly changed code comments and documentation.
>
> Noteworthy remarks:
>
> - A general reminder: single line comments should start with a lower case
> letter and have to period in the end:
>
Should it be "have not to period in the end" ?
>
> /* this is a typical single line comment */
>
I fixed almost all without parts related to psql and executor - there
almost all current comments break the mentioned rule. So I use the style
used in these files. I can fix it (if you like it) - or it can be fixed
generally in a separated patch set.
> - Variable as parameter:
>
> CREATE VARIABLE var AS date;
> LET var = current_date;
> PREPARE stmt(date) AS SELECT $1;
> EXECUTE stmt(var);
> ERROR: paramid of PARAM_VARIABLE param is out of range
>
> Is that working as intended? I don't understand the error message.
>
fixed in 0002 patch (variables cannot be used as EXECUTE argument) and 0014
(enable usage variables as an argument of EXECUTE)
>
> Perhaps there is a bug in src/backend/executor/execExpr.c.
>
> - IdentifyVariable
>
> > --- a/src/backend/catalog/namespace.c
> > +++ b/src/backend/catalog/namespace.c
> > [...]
> > +/*
> > + * IdentifyVariable - try to find variable identified by list of
> names.
> > [...]
>
> The function comment doesn't make clear to me what the function does.
> Perhaps that is so confusing because the function seems to serve several
> purposes (during parsing, in ANALYZE, and to identify shadowed variables
> in a later patch).
>
> I rewrote the function comment, but didn't touch the code or code
> comments.
>
merged
>
> Perhaps part of the reason why this function is so complicated is that
> you support things like this:
>
> CREATE SCHEMA sch;
> CREATE TYPE cp AS (a integer, b integer);
> CREATE VARIABLE sch.v AS cp;
> LET sch.v = (1, 2);
> SELECT sch.v.b;
> b
> ═══
> 2
> (1 row)
>
> test=# SELECT test.sch.v.b;
> b
> ═══
> 2
> (1 row)
>
> We don't support that for tables:
>
> CREATE TABLE tab (c cp);
> INSERT INTO tab VALUES (ROW(42, 43));
> SELECT tab.c.b FROM tab;
> ERROR: cross-database references are not implemented: tab.c.b
>
> You have to use extra parentheses:
>
> SELECT (tab.c).b FROM tab;
> b
> ════
> 43
> (1 row)
>
> I'd say that this should be the same for session variables.
> What do you think?
>
I prefer the current state, but I don't have a very strong opinion about
it. It can save 115 lines of almost trivial code, but I think the user
experience can be much worse. Using composite types in tables is not too
common a pattern (and when I read some recommendations for Oracle, it is an
antipattern), but usage of composite variables is common (it can hold a
row). Moreover, we talked long about possible identifier's collisions, and
the pattern schema.variable is very good protection against possible
collisions. Probably the pattern catalog.schema.variable.field is not too
interesting but the support has 50 lines.
More, the plpgsql supports pattern label.variable.field, so can be little
bit unfriendly if session variables doesn't support "similar" pattern
create type t as (x int, y int);
do $$
<<lab1>>declare v t;
begin
v := (20,20); raise notice '%', lab1.v.x;
end $$;
NOTICE: 20
DO
> Code comments:
>
> - There are lots of variables declared at the beginning, but they are
> only
> used in code blocks further down. For clarity, you should declare a
> variable in the code block where it is used.
>
moved
>
> - + /*
> + * In this case, "a" is used as catalog name -
> check it.
> + */
> + if (strcmp(a, get_database_name(MyDatabaseId)) !=
> 0)
> + {
> + if (!noerror)
> + ereport(ERROR,
> +
> (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
> + errmsg("cross-database references
> are not implemented: %s",
> + NameListToString(names))));
> + }
> + else
> + {
>
> There needn't be an "else", since the ereport() will never return.
> Less indentation is better.
>
>
done
> - src/backend/commands/session_variable.c
>
> There is one comment that confuses me and that I did not edit:
>
> > +Datum
> > +GetSessionVariable(Oid varid, bool *isNull, Oid *typid)
> > +{
> > + SVariable svar;
> > +
> > + svar = get_session_variable(varid);
> > +
> > + /*
> > + * Although svar is freshly validated in this point, the
> svar->is_valid can
> > + * be false, due possible accepting invalidation message inside
> domain
> > + * check. Now, the validation is done after lock, that can also
> accept
> > + * invalidation message, so validation should be trustful.
> > + *
> > + * For now, we don't need to repeat validation. Only svar should
> be valid
> > + * pointer.
> > + */
>
This comment is related to assertions. Before I had there
`Assert(svar->is_valid)`, because I expected it. But it was not always
true. And although it is true, we don't need to validate a variable,
because at this moment, the variable should be locked, and then we can
return content safely.
> > + Assert(svar);
> > +
> > + *typid = svar->typid;
> > +
> > + return copy_session_variable_value(svar, isNull);
> > +}
>
>
> - src/backend/executor/execMain.c
>
> > @@ -196,6 +198,51 @@ standard_ExecutorStart(QueryDesc *queryDesc, int
> eflags)
> > Assert(queryDesc->sourceText != NULL);
> > estate->es_sourceText = queryDesc->sourceText;
> >
> > + /*
> > + * The executor doesn't work with session variables directly.
> Values of
> > + * related session variables are copied to dedicated array, and
> this array
> > + * is passed to executor.
> > + */
>
> Why? Is that a performance measure? The comment should explain that.
>
Session variables are volatile internally - some function that uses LET
statements can be called more times inside a query. This behavior is not
consistent with plpgsql's variables or external parameters. And if we want
to support parallel queries, then volatile session variables can be pretty
messy (and unusable). If we want the same behaviour for queries with or
without parallel support, then the session variables should be "stable"
(like stable functions). Simple implementation is using "snapshot" of
values of used session variables when query is started. This "snapshot" is
immutable, so we don't need to make a copy more times.
I changed this comment
<-->/*
<--> * The executor doesn't work with session variables directly. Values of
<--> * related session variables are copied to a dedicated array, and this
array
<--> * is passed to the executor. This array is stable "snapshot" of values
of used
<--> * session variables. There are three benefits of this strategy:
<--> *
<--> * - consistency with external parameters and plpgsql variables,
<--> *
<--> * - session variables can be parallel safe,
<--> *
<--> * - we don't need make fresh copy for any read of session variable
<--> * (this is necessary because the internally the session variable can
<--> * be changed inside query execution time, and then a reference to
<--> * previously returned value can be corrupted).
<--> */
> - parallel safety
>
> > --- a/src/backend/optimizer/util/clauses.c
> > +++ b/src/backend/optimizer/util/clauses.c
> > @@ -935,6 +936,13 @@ max_parallel_hazard_walker(Node *node,
> max_parallel_hazard_context *context)
> > if (param->paramkind == PARAM_EXTERN)
> > return false;
> >
> > + /* We doesn't support passing session variables to workers */
> > + if (param->paramkind == PARAM_VARIABLE)
> > + {
> > + if (max_parallel_hazard_test(PROPARALLEL_RESTRICTED,
> context))
> > + return true;
> > + }
>
> Even if a later patch alleviates that restriction, this patch should
> document that
> session variables imply "parallel restricted". I have added that to
> doc/src/sgml/parallel.sgml.
>
merged (and removed in 0015)
>
> Attached are the first two patches with my edits (the first patch is
> unchanged;
> I just add it again to keep the cfbot happy.
>
> I hope to get to review the other patches at some later time.
>
> Yours,
> Laurenz Albe
>
Attached new patchset
Regards
Pavel
From | Date | Subject | |
---|---|---|---|
Next Message | Peter Smith | 2024-08-15 06:27:01 | Re: Logical Replication of sequences |
Previous Message | Peter Eisentraut | 2024-08-15 05:14:26 | Re: Variable renaming in dbcommands.c |
From | Date | Subject | |
---|---|---|---|
Next Message | Stephen Samuel (Sam) | 2024-08-19 01:55:50 | Trying to understand why a query is filtering when there is a composite index |
Previous Message | Pavel Stehule | 2024-08-08 05:17:00 | Re: proposal: schema variables |