Re: generic plans and "initial" pruning

From: Amit Langote <amitlangote09(at)gmail(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>, Andres Freund <andres(at)anarazel(dot)de>, Daniel Gustafsson <daniel(at)yesql(dot)se>, David Rowley <dgrowleyml(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, Thom Brown <thom(at)linux(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Subject: Re: generic plans and "initial" pruning
Date: 2024-08-29 13:34:17
Message-ID: CA+HiwqH9u1RWn9OEa=VQQpJagB0hDLCY+=fSyBC4ZkeU6Gg2HA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Fri, Aug 23, 2024 at 9:48 PM Amit Langote <amitlangote09(at)gmail(dot)com> wrote:
> On Wed, Aug 21, 2024 at 10:10 PM Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
> > On Wed, Aug 21, 2024 at 8:45 AM Amit Langote <amitlangote09(at)gmail(dot)com> wrote:
> > > * The replanning aspect of the lock-in-the-executor design would be
> > > simpler if a CachedPlan contained the plan for a single query rather
> > > than a list of queries, as previously mentioned. This is particularly
> > > due to the requirements of the PORTAL_MULTI_QUERY case. However, this
> > > option might be impractical.
> >
> > It might be, but maybe it would be worth a try? I mean,
> > GetCachedPlan() seems to just call pg_plan_queries() which just loops
> > over the list of query trees and does the same thing for each one. If
> > we wanted to replan a single query, why couldn't we do
> > fake_querytree_list = list_make1(list_nth(querytree_list, n)) and then
> > call pg_plan_queries(fake_querytree_list)? Or something equivalent to
> > that. We could have a new GetCachedSinglePlan(cplan, n) to do this.
>
> I've been hacking to prototype this, and it's showing promise. It
> helps make the replan loop at the call sites that start the executor
> with an invalidatable plan more localized and less prone to
> action-at-a-distance issues. However, the interface and contract of
> the new function in my prototype are pretty specialized for the replan
> loop in this context—meaning it's not as general-purpose as
> GetCachedPlan(). Essentially, what you get when you call it is a
> 'throwaway' CachedPlan containing only the plan for the query that
> failed during ExecutorStart(), not a plan integrated into the original
> CachedPlanSource's stmt_list. A call site entering the replan loop
> will retry the execution with that throwaway plan, release it once
> done, and resume looping over the plans in the original list. The
> invalid plan that remains in the original list will be discarded and
> replanned in the next call to GetCachedPlan() using the same
> CachedPlanSource. While that may sound undesirable, I'm inclined to
> think it's not something that needs optimization, given that we're
> expecting this code path to be taken rarely.
>
> I'll post a version of a revamped locks-in-the-executor patch set
> using the above function after debugging some more.

Here it is.

0001 implements changes to defer the locking of runtime-prunable
relations to the executor. The new design introduces a bitmapset
field in PlannedStmt to distinguish at runtime between relations that
are prunable whose locking can be deferred until ExecInitNode() and
those that are not and must be locked in advance. The set of prunable
relations can be constructed by looking at all the PartitionPruneInfos
in the plan and checking which are subject to "initial" pruning steps.
The set of unprunable relations is obtained by subtracting those from
the set of all RT indexes. This design gets rid of one annoying
aspect of the old design which was the need to add specialized fields
to store the RT indexes of partitioned relations that are not
otherwise referenced in the plan tree. That was necessary because in
the old design, I had removed the function AcquireExecutorLocks()
altogether to defer the locking of all child relations to execution.
In the new design such relations are still locked by
AcquireExecutorLocks().

0002 is the old patch to make ExecEndNode() robust against partially
initialized PlanState nodes by adding NULL checks.

0003 is the patch to add changes to deal with the CachedPlan becoming
invalid before the deferred locks on prunable relations are taken.
I've moved the replan loop into a new wrapper-over-ExecutorStart()
function instead of having the same logic at multiple sites. The
replan logic uses the GetSingleCachedPlan() described in the quoted
text. The callers of the new ExecutorStart()-wrapper, which I've
dubbed ExecutorStartExt(), need to pass the CachedPlanSource and a
query_index, which is the index of the query being executed in the
list CachedPlanSource.query_list. They are needed by
GetSingleCachedPlan(). The changes outside the executor are pretty
minimal in this design and all the difficulties of having to loop back
to GetCachedPlan() are now gone. I like how this turned out.

One idea that I think might be worth trying to reduce the footprint of
0003 is to try to lock the prunable relations in a step of InitPlan()
separate from ExecInitNode(), which can be implemented by doing the
initial runtime pruning in that separate step. That way, we'll have
all the necessary locks before calling ExecInitNode() and so we don't
need to sprinkle the CachedPlanStillValid() checks all over the place
and worry about missed checks and dealing with partially initialized
PlanState trees.

--
Thanks, Amit Langote

Attachment Content-Type Size
v51-0003-Handle-CachedPlan-invalidation-in-the-executor.patch application/octet-stream 84.7 KB
v51-0001-Defer-locking-of-runtime-prunable-relations-to-e.patch application/octet-stream 31.1 KB
v51-0002-Assorted-tightening-in-various-ExecEnd-routines.patch application/octet-stream 31.3 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message jian he 2024-08-29 13:35:49 Re: Virtual generated columns
Previous Message Bharath Rupireddy 2024-08-29 13:32:15 Re: Switching XLog source from archive to streaming when primary available