Re: generic plans and "initial" pruning

From: Amit Langote <amitlangote09(at)gmail(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>, Andres Freund <andres(at)anarazel(dot)de>, Daniel Gustafsson <daniel(at)yesql(dot)se>, David Rowley <dgrowleyml(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, Thom Brown <thom(at)linux(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Subject: Re: generic plans and "initial" pruning
Date: 2024-09-19 08:39:51
Message-ID: CA+HiwqGBpw_JNwkwZjQ2YaqTWrDjn9L5jpuc+nS8=a55SPD+UA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Tue, Sep 17, 2024 at 9:57 PM Amit Langote <amitlangote09(at)gmail(dot)com> wrote:
> On Thu, Aug 29, 2024 at 10:34 PM Amit Langote <amitlangote09(at)gmail(dot)com> wrote:
> > One idea that I think might be worth trying to reduce the footprint of
> > 0003 is to try to lock the prunable relations in a step of InitPlan()
> > separate from ExecInitNode(), which can be implemented by doing the
> > initial runtime pruning in that separate step. That way, we'll have
> > all the necessary locks before calling ExecInitNode() and so we don't
> > need to sprinkle the CachedPlanStillValid() checks all over the place
> > and worry about missed checks and dealing with partially initialized
> > PlanState trees.
>
> I've worked on this and found that it results in a much simpler design.
>
> Attached are 0001 and 0002, which contain patches to refactor the
> runtime pruning code. These changes move initial pruning outside of
> ExecInitNode() and use the results during ExecInitNode() to determine
> the set of child subnodes to initialize.
>
> With that in place, the patches (0003, 0004) that move the locking of
> prunable relations from plancache.c into the executor becomes simpler.
> It no longer needs to modify any code called by ExecInitNode(). Since
> no new locks are taken during ExecInitNode(), I didn't have to worry
> about changing all the code involved in PlanState tree initialization
> to add checks for CachedPlan validity. The check is only needed after
> performing initial pruning, and if the CachedPlan is invalid,
> ExecInitNode() won’t be called in the first place.

Sorry, I had missed merging some hunks into 0002 that fixed obsolete
comments. Fixed in the attached v54.

Regarding 0002, I was a bit bothered by the need to add a new function
just to iterate over the PartitionPruningDatas and the
PartitionedRelPruningData they contain, solely to initialize the
PartitionPruneContext needed for exec pruning. To address this, I
propose 0003, which moves the initialization of those contexts to be
done "lazily" in find_matching_subplan_recurse(), where they are
actually used. To make this work, I added an is_valid flag to
PartitionPruneContext, which is checked as follows in the code block
where it's initialized:

+ if (unlikely(!pprune->exec_context.is_valid))

I didn't notice any overhead of adding this to
find_matching_partitions_recurse() which is called for every instance
of exec pruning, so I think it's worthwhile to consider 0003.

I realized that I had missed considering, in the
delay-locking-to-executor patch (now 0004), that there may be plan
objects belonging to pruned partitions, such as RowMarks and
ResultRelInfos, which should not be initialized.
ExecGetRangeTableRelation() invoked with the RT indexes in these
objects would cause crashes in Assert builds since the pruned
partitions would not have been locked. I've updated the patch to
ignore RowMarks and result relations (in ModifyTable.resultRelations)
for pruned child relations, which required adding more accounting info
to EState to store the bitmapset of unpruned RT indexes. For
ResultRelInfos, I took the approach of memsetting them to 0 for pruned
result relations and adding checks at multiple sites to ensure the
ResultRelInfo being handled is valid. I recall previously proposing
lazy initialization for these objects when first needed [1], which
would make the added code unnecessary, but I might save that for
another time.

--
Thanks, Amit Langote
[1] https://postgr.es/m/468c85d9-540e-66a2-1dde-fec2b741e688@lab.ntt.co.jp

Attachment Content-Type Size
v54-0003-Defer-locking-of-runtime-prunable-relations-to-e.patch application/x-patch 38.3 KB
v54-0002-Perform-runtime-initial-pruning-outside-ExecInit.patch application/x-patch 17.2 KB
v54-0001-Move-PartitionPruneInfo-out-of-plan-nodes-into-P.patch application/x-patch 19.9 KB
v54-0004-Handle-CachedPlan-invalidation-in-the-executor.patch application/x-patch 55.2 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Amit Kapila 2024-09-19 08:40:57 Re: Pgoutput not capturing the generated columns
Previous Message Alvaro Herrera 2024-09-19 08:26:00 Re: not null constraints, again