Re: proposal: schema variables

From: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
To: Laurenz Albe <laurenz(dot)albe(at)cybertec(dot)at>
Cc: Erik Rijkers <er(at)xs4all(dot)nl>, Michael Paquier <michael(at)paquier(dot)xyz>, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, DUVAL REMI <REMI(dot)DUVAL(at)cheops(dot)fr>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: proposal: schema variables
Date: 2024-08-27 06:52:27
Message-ID: CAFj8pRCy8NTG3ewOrH0KLQgrMpVMSzPjhoLKW=c1Jjz8QWNSKw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi

út 27. 8. 2024 v 8:15 odesílatel Laurenz Albe <laurenz(dot)albe(at)cybertec(dot)at>
napsal:

> On Thu, 2024-08-15 at 07:55 +0200, Pavel Stehule wrote:
> > út 30. 7. 2024 v 21:46 odesílatel Laurenz Albe <laurenz(dot)albe(at)cybertec(dot)at>
> napsal:
> > > - A general reminder: single line comments should start with a lower
> case
> > > letter and have to period in the end:
> >
> > Should it be "have not to period in the end" ?
>
> I made a typo. I mean "have *no* period in the end".
>
> > I fixed almost all without parts related to psql and executor - there
> almost all
> > current comments break the mentioned rule. So I use the style used in
> these files.
> > I can fix it (if you like it) - or it can be fixed generally in a
> separated patch set.
>
> Thanks. I also noticed that a lot of existing comments break that rule,
> so I think that it is fine if you stick with the way that the surrounding
> code does it.
>
> > > - Variable as parameter:
> > >
> > > CREATE VARIABLE var AS date;
> > > LET var = current_date;
> > > PREPARE stmt(date) AS SELECT $1;
> > > EXECUTE stmt(var);
> > > ERROR: paramid of PARAM_VARIABLE param is out of range
> > >
> > > Is that working as intended? I don't understand the error message.
> >
> > fixed in 0002 patch (variables cannot be used as EXECUTE argument) and
> 0014 (enable usage variables as an argument of EXECUTE)
>
> Thanks.
>
> > > > --- a/src/backend/catalog/namespace.c
> > > > +++ b/src/backend/catalog/namespace.c
> > > > [...]
> > > > +/*
> > > > + * IdentifyVariable - try to find variable identified by list of
> names.
> > > > [...]
> > >
> > > Perhaps part of the reason why this function is so complicated is
> that
> > > you support things like this:
> > >
> > > CREATE SCHEMA sch;
> > > CREATE TYPE cp AS (a integer, b integer);
> > > CREATE VARIABLE sch.v AS cp;
> > > LET sch.v = (1, 2);
> > > SELECT sch.v.b;
> > > b
> > > ═══
> > > 2
> > > (1 row)
> > >
> > > test=# SELECT test.sch.v.b;
> > > b
> > > ═══
> > > 2
> > > (1 row)
> > >
> > > We don't support that for tables:
> > >
> > > CREATE TABLE tab (c cp);
> > > INSERT INTO tab VALUES (ROW(42, 43));
> > > SELECT tab.c.b FROM tab;
> > > ERROR: cross-database references are not implemented: tab.c.b
> > >
> > > You have to use extra parentheses:
> > >
> > > SELECT (tab.c).b FROM tab;
> > > b
> > > ════
> > > 43
> > > (1 row)
> > >
> > > I'd say that this should be the same for session variables.
> > > What do you think?
> >
> > I prefer the current state, but I don't have a very strong opinion about
> it.
> > It can save 115 lines of almost trivial code, but I think the user
> experience
> > can be much worse. Using composite types in tables is not too common a
> > pattern (and when I read some recommendations for Oracle, it is an
> antipattern),
> > but usage of composite variables is common (it can hold a row). Moreover,
> > we talked long about possible identifier's collisions, and the pattern
> > schema.variable is very good protection against possible collisions.
> > Probably the pattern catalog.schema.variable.field is not too interesting
> > but the support has 50 lines.
> >
> > More, the plpgsql supports pattern label.variable.field, so can be
> little bit
> > unfriendly if session variables doesn't support "similar" pattern
>
> I see your point, and I'm fine with leaving it as it is.
>
> >
> >
> > > - src/backend/commands/session_variable.c
> > >
> > > There is one comment that confuses me and that I did not edit:
> > >
> > > > +Datum
> > > > +GetSessionVariable(Oid varid, bool *isNull, Oid *typid)
> > > > +{
> > > > + SVariable svar;
> > > > +
> > > > + svar = get_session_variable(varid);
> > > > +
> > > > + /*
> > > > + * Although svar is freshly validated in this point, the
> svar->is_valid can
> > > > + * be false, due possible accepting invalidation message
> inside domain
> > > > + * check. Now, the validation is done after lock, that can
> also accept
> > > > + * invalidation message, so validation should be trustful.
> > > > + *
> > > > + * For now, we don't need to repeat validation. Only svar
> should be valid
> > > > + * pointer.
> > > > + */
> >
> > This comment is related to assertions. Before I had there
> `Assert(svar->is_valid)`,
> > because I expected it. But it was not always true. And although it is
> true,
> > we don't need to validate a variable, because at this moment, the
> variable
> > should be locked, and then we can return content safely.
>
> I guess my main problem is the word "trustful". I don't recognize that
> word.
> Perhaps you can reword the comment along the lines of your above
> explanation.
>

I'll try to change it

>
> >
> > > - src/backend/executor/execMain.c
> > >
> > > > @@ -196,6 +198,51 @@ standard_ExecutorStart(QueryDesc *queryDesc,
> int eflags)
> > > > Assert(queryDesc->sourceText != NULL);
> > > > estate->es_sourceText = queryDesc->sourceText;
> > > >
> > > > + /*
> > > > + * The executor doesn't work with session variables directly.
> Values of
> > > > + * related session variables are copied to dedicated array,
> and this array
> > > > + * is passed to executor.
> > > > + */
> > >
> > > Why? Is that a performance measure? The comment should explain
> that.
> >
> > Session variables are volatile internally - some function that uses LET
> statements
> > can be called more times inside a query. This behavior is not consistent
> with
> > plpgsql's variables or external parameters. And if we want to support
> parallel
> > queries, then volatile session variables can be pretty messy (and
> unusable).
> > If we want the same behaviour for queries with or without parallel
> support,
> > then the session variables should be "stable" (like stable functions).
> > Simple implementation is using "snapshot" of values of used session
> variables
> > when query is started. This "snapshot" is immutable, so we don't need to
> make
> > a copy more times.
> >
> > I changed this comment
>
> Thanks.
>
> > > - parallel safety
> > >
> > > > --- a/src/backend/optimizer/util/clauses.c
> > > > +++ b/src/backend/optimizer/util/clauses.c
> > > > @@ -935,6 +936,13 @@ max_parallel_hazard_walker(Node *node,
> max_parallel_hazard_context *context)
> > > > if (param->paramkind == PARAM_EXTERN)
> > > > return false;
> > > >
> > > > + /* We doesn't support passing session variables to workers
> */
> > > > + if (param->paramkind == PARAM_VARIABLE)
> > > > + {
> > > > + if (max_parallel_hazard_test(PROPARALLEL_RESTRICTED,
> context))
> > > > + return true;
> > > > + }
> > >
> > > Even if a later patch alleviates that restriction, this patch should
> document that
> > > session variables imply "parallel restricted". I have added that to
> doc/src/sgml/parallel.sgml.
> > >
> >
> > merged (and removed in 0015)
>
> Thanks.
>
>
> I hope I can do some more review at some point in the future...
>
> I sincerely hope that this patch set gets merged at some point.
> The big obstacle is that it is so large. That's probably because it is
> pretty
> feature-complete (but, as we have found, not totally free of bugs).
>

Any feature that builds its own system catalog cannot be short. The first
two patches have 300KB, like all others and just basic support for reading
and writing. All other features have 300KB. On second hand almost all code
is simple, and there are no changes in critical parts of Postgres. The size
and complexity of JSONPath is level higher. I can throw 200KB from another
300KB patch set which can be better for review, but it can be harder to
maintain this patch set. I'll try it, and I'll send a reduced version.

> Judging from the amount of time I put into my review so far, I guess that
> this
> patch set would keep a committer busy for several weeks. Perhaps the only
> way to
> get this done would be to make you a committer...
>

:-)

Unfortunately, I am very bad at languages, I had a lot of problems in basic
school just with Czech lang, Russian and English was more terrible) - so I
very appreciate your work on this patch.

Thank you very much

Pavel

> Yours,
> Laurenz Albe
>

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Thomas Munro 2024-08-27 06:52:44 Re: Streaming read-ready sequential scan code
Previous Message chungui.wcg 2024-08-27 06:43:51 Re:allowing extensions to control planner behavior