From: | Amit Langote <amitlangote09(at)gmail(dot)com> |
---|---|
To: | Tomas Vondra <tomas(at)vondra(dot)me> |
Cc: | Robert Haas <robertmhaas(at)gmail(dot)com>, Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>, Andres Freund <andres(at)anarazel(dot)de>, Daniel Gustafsson <daniel(at)yesql(dot)se>, David Rowley <dgrowleyml(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, Thom Brown <thom(at)linux(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
Subject: | Re: generic plans and "initial" pruning |
Date: | 2024-12-12 07:58:54 |
Message-ID: | CA+HiwqHOejJk0_qMuM5g38h70hY_JvHMAKwnH3k=urfTXauPQA@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On Mon, Dec 9, 2024 at 4:10 PM Amit Langote <amitlangote09(at)gmail(dot)com> wrote:
> On Fri, Dec 6, 2024 at 5:18 PM Amit Langote <amitlangote09(at)gmail(dot)com> wrote:
> > On Thu, Dec 5, 2024 at 11:07 PM Tomas Vondra <tomas(at)vondra(dot)me> wrote:
> > > On 12/5/24 12:28, Amit Langote wrote:
> > > > On Thu, Dec 5, 2024 at 3:53 PM Amit Langote <amitlangote09(at)gmail(dot)com> wrote:
> > > >> On Thu, Dec 5, 2024 at 2:20 AM Tomas Vondra <tomas(at)vondra(dot)me> wrote:
> > > >>> Sure, changing the APIs is allowed, I'm just wondering if maybe there
> > > >>> might be a way to not have this issue, or at least notice the missing
> > > >>> call early.
> > > >>>
> > > >>> I haven't tried, wouldn't it be better to modify ExecutorStart() to do
> > > >>> the retries internally? I mean, the extensions wouldn't need to check if
> > > >>> the plan is still valid, ExecutorStart() would take care of that. Yeah,
> > > >>> it might need some new arguments, but that's more obvious.
> > > >>
> > > >> One approach could be to move some code from standard_ExecutorStart()
> > > >> into ExecutorStart(). Specifically, the code responsible for setting
> > > >> up enough state in the EState to perform ExecDoInitialPruning(), which
> > > >> takes locks that might invalidate the plan. If the plan does become
> > > >> invalid, the hook and standard_ExecutorStart() are not called.
> > > >> Instead, the caller, ExecutorStartExt() in this case, creates a new
> > > >> plan.
> > > >>
> > > >> This avoids the need to add ExecPlanStillValid() checks anywhere,
> > > >> whether in core or extension code. However, it does mean accessing the
> > > >> PlannedStmt earlier than InitPlan(), but the current placement of the
> > > >> code is not exactly set in stone.
> > > >
> > > > I tried this approach and found that it essentially disables testing
> > > > of this patch using the delay_execution module, which relies on the
> > > > ExecutorStart_hook(). The way the testing works is that the hook in
> > > > delay_execution.c pauses the execution of a cached plan to allow a
> > > > concurrent session to drop an index referenced in the plan. When
> > > > unpaused, execution initialization resumes by calling
> > > > standard_ExecutorStart(). At this point, obtaining the lock on the
> > > > partition whose index has been dropped invalidates the plan, which the
> > > > hook detects and reports. It then also reports the successful
> > > > re-execution of an updated plan that no longer references the dropped
> > > > index. Hmm.
> > > >
> > >
> > > It's not clear to me why the change disables this testing, and I can't
> > > try without a patch. Could you explain?
> >
> > Sorry, PFA the delta patch for the change I described above. It
> > applies on top of v58 series of patches that I posted yesterday.
> > You'll notice that delay_execution test fails if you apply and do
> > check-world.
> >
> > As for how the change breaks the testing, here is a before and after
> > of the flow of a isolation test in
> > src/test/modules/delay_execution/specs/cached-plan-inval.spec (s1 is
> > the session used to run a cached plan, s2 to perform concurrent DDL
> > that invalidates the plan):
> >
> > * Before (working):
> >
> > 1. s2 takes advisory lock
> > 2. s1 runs cached plan -> goes to ExecutorStart_hook -> waits for the
> > advisory lock
> > 3. s2 drops an index referenced in the plan
> > 4. s2 unlocks advisory lock
> > 5. s1 locks unpruned partitions -> detects plan invalidation due to
> > dropped index.
> >
> > * After (stops working because initial pruning and locking are done
> > before calling ExecutorStart_hook):
> >
> > 1. s2 takes advisory lock
> > 2. s1 runs cached plan -> locks unpruned partitions -> goes to
> > ExecutorStart_hook to get advisory lock -> waits for advisory lock
> > 3. s2 drops an index referenced in the plan -> waits for lock on the
> > unpruned partition -> deadlock!
> >
> > One idea I had after sending the email yesterday is to introduce
> > ExecutorStartCachedPlan_hook for the advisory lock based waiting.
> > ExecutorStartCachedPlan() is the new function that you will find in
> > v58-0004 that wraps ExecutorStart() to handle plan invalidation. This
> > new hook would be called before ExecutorStartCachedPlan() calls
> > ExecutorStart(), so the original testing flow can still work.
>
> Here's that patch with this idea implemented that fixes the
> delay_execution test breakage. Applies on top of v58 series of
> patches.
>
> However, as mentioned in my previous reply, since extensions might
> need to adjust their ExecutorStart hook code to check if the RT index
> is in EState.es_unpruned_relids when accessing child relations
> directly via ExecGetRangeTableRelation(), I can accept them also
> adding a check for ExecPlanStillValid() in their ExecutorStart hook.
> So we may not want to add a new hook even if only for testing.
One thing I realized about the es_unpruned_relids bitmapset is that
ExecGetRangeTableRelation() should verify that any RT index passed to
it is a member of the bitmapset. If the RT index is not included, the
function should throw an error to catch such cases. I made that change
in 0004. This approach can help identify extensions that manipulate RT
entries belonging to potentially pruned partitions, provided they use
the ExecGetRangeTableRelation() interface to open those relations.
To summarize how extensions can be affected:
1. Plan invalidation during standard_ExecutorStart(): A plan tree
originating from a CachedPlan can become invalid during
standard_ExecutorStart() due to locks taken on leaf partitions that
survive initial pruning. Extensions should be updated to handle this
scenario by checking ExecPlanStillValid(estate) immediately after
calling standard_ExecutorStart() in their ExecutorStart_hook. If it
returns false, the extensions should avoid further processing.
2. Validation of RT indexes: If the plan tree remains valid, any
direct manipulation of relations using RT indexes must first verify
that the RT index is present in the EState.es_unpruned_relids
bitmapset. This bitmapset includes: a) RT indexes of relations that
are originally unprunable (and locked during GetCachedPlan()), and
b) RT indexes of leaf partitions that survive initial partition
pruning. This step is crucial because pruned relations are not locked.
Additionally, with the update in 0004, attempting to open pruned
relations using ExecGetRangeTableRelation() will result in an error.
I’d love to hear from anyone maintaining executor hooks, such as those
from Timescale, Citus, or other extension developers. Please give this
patch set (0001-0004) a try and let me know if you run into any issues
or have feedback. 0005 is a sketch of an approach that eliminates the
need for extensions to check ExecPlanStillValid() in their
ExecutorStart_hook.
--
Thanks, Amit Langote
Attachment | Content-Type | Size |
---|---|---|
v59-0002-Initialize-PartitionPruneContexts-lazily.patch | application/octet-stream | 16.9 KB |
v59-0001-Move-PartitionPruneInfo-out-of-plan-nodes-into-P.patch | application/octet-stream | 20.4 KB |
v59-0003-Perform-runtime-initial-pruning-outside-ExecInit.patch | application/octet-stream | 14.6 KB |
v59-0005-Remove-the-need-to-check-if-plan-is-valid-from-E.patch | application/octet-stream | 24.4 KB |
v59-0004-Defer-locking-of-runtime-prunable-relations-in-c.patch | application/octet-stream | 119.3 KB |
From | Date | Subject | |
---|---|---|---|
Next Message | Michael Paquier | 2024-12-12 08:00:34 | Re: Fix comments related to pending statistics |
Previous Message | Shubham Khanna | 2024-12-12 07:57:41 | Re: Adding a '--two-phase' option to 'pg_createsubscriber' utility. |