Quick Links

Re: why partition pruning doesn't work?

From:	Amit Langote <Langote_Amit_f8(at)lab(dot)ntt(dot)co(dot)jp>
To:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc:	Robert Haas <robertmhaas(at)gmail(dot)com>, David Rowley <david(dot)rowley(at)2ndquadrant(dot)com>, Dmitry Dolgov <9erthalion6(at)gmail(dot)com>, Jeff Janes <jeff(dot)janes(at)gmail(dot)com>, Ashutosh Bapat <ashutosh(dot)bapat(at)enterprisedb(dot)com>, Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject:	Re: why partition pruning doesn't work?
Date:	2018-07-18 08:59:52
Message-ID:	ab10ce44-18b0-244a-c9c8-67d7020e2ba9@lab.ntt.co.jp
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Thread:
Lists:	pgsql-hackers

On 2018/07/16 2:02, Tom Lane wrote:
> Amit Langote <Langote_Amit_f8(at)lab(dot)ntt(dot)co(dot)jp> writes:
>> On 2018/06/19 2:05, Tom Lane wrote:
>>> Or maybe what we should do is drop ExecLockNonLeafAppendTables/
>>> ExecOpenAppendPartitionedTables entirely and teach InitPlan to do it.
>
>> Hmm, for InitPlan to do what ExecOpenAppendPartitionedTables does, we'd
>> have to have all the partitioned tables (contained in partitioned_rels
>> fields of all Append/MergeAppend/ModifyTable nodes of a plan) also listed
>> in a global list like rootResultRelations and nonleafResultRelations of
>> PlannedStmt.
>
>> Attached updated patch.
>
> I've been looking at this patch, and while it's not unreasonable on its
> own terms, I'm growing increasingly distressed at the ad-hoc and rather
> duplicative nature of the data structures that have gotten stuck into
> plan trees thanks to partitioning (the rootResultRelations and
> nonleafResultRelations lists being prime examples).
>
> It struck me this morning that a whole lot of the complication here is
> solely due to needing to identify the right type of relation lock to take
> during executor startup, and that THAT WORK IS TOTALLY USELESS.

I agree. IIUC, that's also the reason why PlannedStmt had to grow a
resultRelations field in the first place.

> In every
> case, we must already hold a suitable lock before we ever get to the
> executor; either one acquired during the parse/plan pipeline, or one
> re-acquired by AcquireExecutorLocks in the case of a cached plan.
> Otherwise it's entirely possible that the plan has been invalidated by
> concurrent DDL --- and it's not the executor's job to detect that and
> re-plan; that *must* have been done upstream.
>
> Moreover, it's important from a deadlock-avoidance standpoint that these
> locks get acquired in the same order as they were acquired during the
> initial parse/plan pipeline. I think it's reasonable to assume they were
> acquired in RTE order in that stage, so AcquireExecutorLocks is OK. But,
> if the current logic in the executor gets them in that order, it's both
> non-obvious that it does so and horribly fragile if it does, seeing that
> the responsibility for this is split across InitPlan,
> ExecOpenScanRelation, and ExecLockNonLeafAppendTables.
>
> So I'm thinking that what we really ought to do here is simplify executor
> startup to just open all rels with NoLock, and get rid of any supporting
> data structures that turn out to have no other use. (David Rowley's
> nearby patch to create a properly hierarchical executor data structure for
> partitioning info is likely to tie into this too, by removing some other
> vestigial uses of those lists.)

I agree it would be nice to be able to get rid of those lists.

> I think that this idea has been discussed in the past, and we felt at
> the time that having the executor take its own locks was a good safety
> measure, and a relatively cheap one since the lock manager is pretty
> good at short-circuiting duplicative lock requests. But those are
> certainly not free. Moreover, I'm not sure that this is really a
> safety measure at all: if the executor were really taking any lock
> not already held, it'd be masking a DDL hazard.
>
> To investigate this further, I made the attached not-meant-for-commit
> hack to verify whether InitPlan and related executor startup functions
> were actually taking any not-previously-held locks. I could only find
> one such case: the parser always opens tables selected FOR UPDATE with
> RowShareLock, but if we end up implementing the resulting row mark
> with ROW_MARK_COPY, the executor is acquiring just AccessShareLock
> (because ExecOpenScanRelation thinks it needs to acquire some lock).
> The patch as presented hacks ExecOpenScanRelation to avoid that, and
> it passes check-world.
>
> What we'd be better off doing, if we go this route, is to install an
> assertion-build-only test that verifies during relation_open(NoLock)
> that some kind of lock is already held on the rel. That would protect
> not only the executor, but a boatload of existing places that open
> rels with NoLock on the currently-unverified assumption that a lock is
> already held.

> I'm also rather strongly tempted to add a field to RangeTblEntry
> that records the appropriate lock strength to take, so that we don't
> have to rely on keeping AcquireExecutorLocks' logic to decide on the
> lock type in sync with whatever the parse/plan pipeline does. (One
> could then imagine adding assertions in the executor that this field
> shows a lock strength of at least X, in place of actually opening
> the rel with X.)
>
> BTW, there'd be a lot to be said for having InitPlan just open all
> the rels and build an array of Relation pointers that parallels the
> RTE list, rather than doing heap_opens in random places elsewhere.

+1 to this. Actually I had written such a patch in place of the one I
posted on this thread, but wasn't confident enough that I had found and
fixed all the places in the executor that would use the Relation pointer
from there instead of fetching it themselves.

Thanks,
Amit

In response to

Re: why partition pruning doesn't work? at 2018-07-15 17:02:22 from Tom Lane

Responses

Re: why partition pruning doesn't work? at 2018-07-18 13:32:52 from Tom Lane

Browse pgsql-hackers by date

	From	Date	Subject
Next Message	Dean Rasheed	2018-07-18 09:14:56	Re: PG 10: could not generate random cancel key
Previous Message	Alexander Korotkov	2018-07-18 08:59:02	Re: Bug in gin insert redo code path during re-compression of empty gin data leaf pages