Re: Using Expanded Objects other than Arrays from plpgsql

From: Andrey Borodin <x4mmm(at)yandex-team(dot)ru>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Michel Pelletier <pelletier(dot)michel(at)gmail(dot)com>, Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>, pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: Re: Using Expanded Objects other than Arrays from plpgsql
Date: 2025-01-26 17:04:15
Message-ID: 38A31221-C1C4-4846-9709-D66ACD76E87A@yandex-team.ru
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general pgsql-hackers

> On 26 Jan 2025, at 20:37, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>
>> And the coverage of newly invented mark_stmt() 42.37%. Some of branches are easy noops, but some are not.
>
> Yeah. I'm not too concerned about that because it's pretty much a
> copy-and-paste of the adjacent code. Maybe we should think about
> some way of refactoring pl_funcs.c to reduce duplication, but I
> don't have any great ideas about how.

OK, now I got it. The whole purpose of 2nd patch is to do
if (expr->target_param >= 0) expr->target_is_local = bms_is_member(expr->target_param, local_dnos);
to local variables.

>
>> expr_is_assignment_source() is named like if it should return nool, but it's void.
>
> I've been less than satisfied with that name too. I intended it
> as a statement of fact, "this expression has been found to be
> the source of an assignment". But it does seem confusing.
> Maybe we should recast it as an action. What do you think of
> "mark_expr_as_assignment_source"?

Sounds better to me. I found no examples of similar functions nether in pl_gram.y, nor in gram.y, so IMO mark_expr_as_assignment_source() is the best candidate.

>
>> I could not grasp from reading the code one generic question about new optimization rule. What cost does checking for possible in-place update incurs to code cannot have this optimization? Is it O(numer_of_arguments) of for every assignment execution?
>
> No, the extra effort is incurred at most once per assignment statement
> per session. (Unless the plpgsql function's cache entry gets
> invalidated, in which case we'd rebuild all of the function's data
> structures and have to redo this work too.)

OK, I think execution benefits justify this preparatory costs.

> We set up the evaluation
> function "paramfunc" as plpgsql_param_eval_var_check if we think we
> might be able to apply this optimization, or plpgsql_param_eval_var_ro
> if we don't think so but the variable is of varlena type. At runtime,
> if the variable's current value is not actually expanded, then
> plpgsql_param_eval_var_check falls through doing essentially the same
> work as plpgsql_param_eval_var_ro, so there should be no added cost.
> The first time we observe that the value *is* expanded, we incur the
> cost to detect whether an optimization is really possible, and then
> we change the "paramfunc" pointer to be the appropriate function
> so as to apply the optimization or not without rechecking. So
> generally speaking, if we're considering a variable of a type that
> doesn't support expansion, there should be zero extra per-execution
> cost. There is some extra cost at function compilation time to
> determine which expressions are assignment sources (but we were doing
> that already) and to discover whether those assignments are to
> nonlocal variables (which is new work, but only needs to be done in
> functions with exception blocks).

Got it, many thanks for the explanation.

But I've got some new questions:

I'm lost in internals of ExprEvalStep. But void *paramarg and his friend void *paramarg2 are cryptic. They always have same type and same meaning, but have very generic names.

I wonder if you plan similar optimizations for array_cat(), array_remove() etc?

+ a := a || a; -- not optimizable

Why is it not optimizable? Because there is no support function, because array_cat() has no support function, or something else?

Besides this, the patch looks good to me.

Best regards, Andrey Borodin.

In response to

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Adrian Klaver 2025-01-26 17:33:06 Re: FATAL: could not send data to WAL stream: lost synchronization with server: got message type "0", length 892351284
Previous Message Tom Lane 2025-01-26 15:37:05 Re: Using Expanded Objects other than Arrays from plpgsql

Browse pgsql-hackers by date

  From Date Subject
Next Message Yura Sokolov 2025-01-26 18:09:32 Re: New process of getting changes into the commitfest app
Previous Message Srinath Reddy 2025-01-26 16:54:50 Re: Why we need to check for local buffers in BufferIsExclusiveLocked and BufferIsDirty?