From: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
---|---|
To: | pgsql-hackers(at)lists(dot)postgresql(dot)org |
Subject: | Inadequate executor locking of indexes |
Date: | 2018-11-08 00:13:56 |
Message-ID: | 19465.1541636036@sss.pgh.pa.us |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
I discovered that it's possible to trigger relation_open's new assertion
about having a lock on the relation by the simple expedient of running
the core regression tests with plan_cache_mode = force_generic_plan.
(This doesn't give me a terribly warm feeling about how thoroughly we've
exercised the relevant code, but I don't know what to do about that.)
The reason for the crash is that nodeIndexscan.c and friends all open
their index-to-scan with code like
/*
* Open the index relation.
*
* If the parent table is one of the target relations of the query, then
* InitPlan already opened and write-locked the index, so we can avoid
* taking another lock here. Otherwise we need a normal reader's lock.
*/
relistarget = ExecRelationIsTargetRelation(estate, node->scan.scanrelid);
indexstate->iss_RelationDesc = index_open(node->indexid,
relistarget ? NoLock : AccessShareLock);
Now, at the time this code was written, InitPlan did indeed ensure that
indexes of a plan's result relation were already opened and locked,
because it calls InitResultRelInfo which used to call ExecOpenIndices.
Nowadays, the ExecOpenIndices part is postponed to ExecInitModifyTable,
which figures it can optimize matters:
/*
* If there are indices on the result relation, open them and save
* descriptors in the result relation info, so that we can add new
* index entries for the tuples we add/update. We need not do this
* for a DELETE, however, since deletion doesn't affect indexes. Also,
* inside an EvalPlanQual operation, the indexes might be open
* already, since we share the resultrel state with the original
* query.
*/
if (resultRelInfo->ri_RelationDesc->rd_rel->relhasindex &&
operation != CMD_DELETE &&
resultRelInfo->ri_IndexRelationDescs == NULL)
ExecOpenIndices(resultRelInfo,
node->onConflictAction != ONCONFLICT_NONE);
Therefore, in a plan consisting of a DELETE ModifyTable atop an indexscan
of the target table, we are opening the index without the executor
acquiring any lock on the index. The only thing that saves us from
triggering that Assert is that in most cases the planner has already
taken a lock on any index it considered (and not released that lock).
But using a prepared plan breaks that.
This problem is ancient; it's unrelated to the recent changes to reduce
executor locking, because that only concerned table locks not index locks.
I'm not certain how much real-world impact it has, since holding a lock
on the index's parent table is probably enough to prevent dire things
from happening in most cases. Still, it seems like a bug.
There are several things we might do to fix this:
1. Drop the "operation != CMD_DELETE" condition from the above-quoted bit
in ExecInitModifyTable. We might be forced into that someday anyway if
we want to support non-heap-style tables, since most other peoples'
versions of indexes do want to know about deletions.
2. Drop the target-table optimization from nodeIndexscan.c and friends,
and just always open the scan index with AccessShareLock. (BTW, it's
a bit inconsistent that these nodes don't release their index locks
at the end; ExecCloseIndices does.)
3. Teach nodeIndexscan.c and friends about the operation != CMD_DELETE
exception, so that they get the lock for themselves in that case. This
seems pretty ugly/fragile, but it's about the only option that won't end
in adding index lock-acquisition overhead in some cases. (But given the
planner's behavior, it's not clear to me how often that really matters.)
BTW, another thing that is bothering me about this is that, both with
the current code and options #1 and #3, there's an assumption that any
indexscan on a query's target table must be underneath the relevant
ModifyTable plan node. Given some other plan tree structure it might
be possible to initialize the indexscan node before the ModifyTable,
breaking the assumption that the latter already got index locks. I'm
not sure if that's actually possible at present, but it doesn't seem
like I'd want to bet on it always being so in the future. This might
be an argument for going with option #2, which eliminates all such
assumptions.
Another idea is to make things work more like we just made them work for
tables, which is to ensure that all index locks are taken before reaching
the executor, or at least as part of some other processing than the plan
tree walk. That would be a good deal more work than any of the options
listed above, though, and it would definitely not be back-patchable if we
decide we need a back-patched fix.
Thoughts?
regards, tom lane
From | Date | Subject | |
---|---|---|---|
Next Message | Imai, Yoshikazu | 2018-11-08 00:34:00 | RE: Small performance tweak to run-time partition pruning |
Previous Message | David Rowley | 2018-11-08 00:06:52 | Re: ATTACH/DETACH PARTITION CONCURRENTLY |