Re: SQLFunctionCache and generic plans

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Alexander Pyhalov <a(dot)pyhalov(at)postgrespro(dot)ru>
Cc: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>, Alexander Korotkov <aekorotkov(at)gmail(dot)com>, pgsql-hackers(at)lists(dot)postgresql(dot)org, Ronan Dunklau <ronan(dot)dunklau(at)aiven(dot)io>
Subject: Re: SQLFunctionCache and generic plans
Date: 2025-01-17 18:27:36
Message-ID: 2275997.1737138456@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Alexander Pyhalov <a(dot)pyhalov(at)postgrespro(dot)ru> writes:
> I've rebased patch on master. Tests pass here.

The cfbot still doesn't like it; my guess is that you built without
--with-libxml and so didn't notice the effects on xml.out.

I've looked through the patch briefly and have a few thoughts:

* You cannot use plancache.c like this:

plansource = CreateCachedPlan(NULL, fcache->src, CreateCommandTag((Node *)parsetree));

CreateCachedPlan supposes that raw_parse_tree == NULL means an empty
query. Now aside from the fact that you didn't update the comment
documenting that, this won't work, because RevalidateCachedQuery
also knows that raw_parse_tree == NULL means an empty query. If an
invalidation happens while the function is running, revalidation
will produce an empty plan list, causing failure. (It seems like
a regression test case exercising such invalidation is called for.)

I think the only way you can really make this work is to have two
kinds of cached plans: the current kind where the starting point
is a RawStmt, and a new kind where the starting point is an
analyzed-but-not-rewritten Query tree (or maybe a list of those,
but if you can hold it to one tree things will be simpler).
I'd be inclined to mechanize that by adding a field named something
like "Query *analyzed_parse_tree" to CachedPlanSource and inventing a
new creation function that parallels CreateCachedPlan but takes an
analyzed Query instead of a RawStmt. Then you'll need to fix
RevalidateCachedQuery so that if analyzed_parse_tree isn't NULL then
it ignores raw_parse_tree and proceeds straight to rewriting the
already-analyzed Query. You'll need to check all the other places
that touch CachedPlanSource.raw_parse_tree, but I think all the other
required updates are pretty obvious. (Do not omit updating the
comments in plancache.h and plancache.c that describe all this.)

* I'm not very convinced by the new logic in
StmtPlanRequiresRevalidation. Is it really the case that "check for
CMD_UTILITY" is sufficient? Even if the coding is okay, the comment
needs work to provide a better explanation why it's okay.

* I'd say lose the enable_sql_func_custom_plans GUC. We don't have
anything comparable for plpgsql and there's been (relatively) little
complaint about that. Aside from adding clutter to the patch, it
contorts the logic in functions.c because it forces you to support
two code paths.

* The regression output changes like

-CONTEXT: SQL function "regexp_match" statement 1
+CONTEXT: SQL function "regexp_match" during startup

seem fairly unfortunate. Aside from making maintenance of the patch
painful, the changed messages provide strictly less information to the
user. I would try hard to eliminate that behavioral change. I think
you could do so, or at least greatly reduce the number of diffs, by
having init_sql_fcache perform only raw parsing (producing a list of
RawStmts, or extracting a list of Queries from prosqlbody) and
delaying creation of a plan for a particular statement until you first
reach execution of that statement. This would be a better way all
around because it'd also fix the anomalies I mentioned upthread for
cases where a DDL statement earlier in the function should affect
parse analysis of a later statement.

* release_plans seems kind of wrong: isn't it giving up most of
the benefit of doing this? plpgsql doesn't seem to release plans
unless it's forced to revalidate. Why should SQL functions
behave differently? Also, I'm pretty sure it's wrong that
you have
+ ReleaseCachedPlan(cplan, NULL);
with no ResourceOwner tracking the reference. That probably
means that the code leaks cached plans on error. For the
current patch iteration with the function's data structure only
meant to live as long as the query, it should be sufficient to
use CurrentResourceOwner to manage these cplan references.

* The patch is hard to read in functions.c because there's a mix of
code-motion and actual logic changes that are touching the same
places. Perhaps it'd be worth splitting it into a series of two
patches: the first one just does code motion such as pushing existing
logic into a new subroutine, and then the second one has the logic
changes. Or maybe that wouldn't help much, but consider giving it a
try.

regards, tom lane

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Andres Freund 2025-01-17 18:40:15 Re: Accept recovery conflict interrupt on blocked writing
Previous Message Adam Brusselback 2025-01-17 18:16:43 Re: Catching query cancelations in PLPython3u