Re: INSERT...ON DUPLICATE KEY LOCK FOR UPDATE

From: Peter Geoghegan <pg(at)heroku(dot)com>
To: Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Andres Freund <andres(at)2ndquadrant(dot)com>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: INSERT...ON DUPLICATE KEY LOCK FOR UPDATE
Date: 2013-12-19 04:39:48
Message-ID: CAM3SWZQDdy7cOhxLzwO419CJrB19ZRtKAx9FkRWmmBWjjJAVDQ@mail.gmail.com
Lists: pgsql-hackers

On Thu, Dec 12, 2013 at 4:18 PM, Peter Geoghegan <pg(at)heroku(dot)com> wrote:
> Both of these revisions have identical ad-hoc test cases included as
> new files - see testcase.sh and upsert.sql. My patch doesn't have any
> unique constraint violations, and has pretty consistent performance,
> while yours has many unique constraint violations. I'd like to hear
> your thoughts on the testcase, and the design implications.

I withdraw the test case. Both approaches behave similarly if you watch
them for long enough, and that's okay.

I also think that my changes to HeapTupleSatisfiesUpdate() have been
made unnecessary by recent bug fixes to that function. The test case
previously described [1] that broke it is no longer reproducible, at
least so far.

Do you think that we need to throw a serialization failure within
ExecLockHeapTupleForUpdateSpec() iff heap_lock_tuple() returns
HeapTupleInvisible and IsolationUsesXactSnapshot()? Also, I'm having a
hard time finding a good choke point at which to catch MVCC snapshots
availing of our special visibility rule when they should not, because
of IsolationUsesXactSnapshot(). It seems sufficient to continue to
assume that Postgres won't attempt to lock any tid that is invisible
under conventional MVCC rules in the first place, except within
ExecLockHeapTupleForUpdateSpec() -- but then what do we actually do
within ExecLockHeapTupleForUpdateSpec()? I'm thinking of a new tqual.c
routine that checks whether the tuple is "in the future", which we'd
re-check when IsolationUsesXactSnapshot(). That's not very modular,
though. Maybe we'd go through heapam.c instead.
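To make that concrete, the kind of check I have in mind looks
something like this (an illustrative sketch only, not actual patch
code; assume relation, tuple and estate are in scope within
ExecLockHeapTupleForUpdateSpec()):

    HTSU_Result result;
    HeapUpdateFailureData hufd;
    Buffer      buffer;

    result = heap_lock_tuple(relation, &tuple,
                             estate->es_output_cid,
                             LockTupleExclusive,
                             false,     /* wait, rather than erroring out */
                             true,      /* follow updates */
                             &buffer, &hufd);

    if (result == HeapTupleInvisible && IsolationUsesXactSnapshot())
        ereport(ERROR,
                (errcode(ERRCODE_T_R_SERIALIZATION_FAILURE),
                 errmsg("could not serialize access due to concurrent update")));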

I think it doesn't matter, for the purposes of the higher isolation
levels, that what now constitute MVCC snapshots have the new, special
"reach into the future" rule, because we'll throw a serialization
failure within ExecLockHeapTupleForUpdateSpec() before that rule is
allowed to become a problem. In order for the new rule to be relevant,
we'd have to be the xact doing the locking in the first place, and as
an xact in non-read-committed mode, we'd be sure to call the new
tqual.c "in the future" routine (or whatever it ends up being). Only
upserters can lock a row in the future, so it is the job of upserters
to care about this special case.
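For illustration only, such a routine might look like this (the name
and exact shape are a strawman; XidInMVCCSnapshot() is currently
static within tqual.c, which is part of why the routine would have to
live there):

    /*
     * Strawman tqual.c addition: was this tuple inserted by a
     * transaction that the given MVCC snapshot does not see as
     * committed -- i.e., is the tuple "in the future" for us?
     */
    bool
    HeapTupleHeaderIsFromFuture(HeapTupleHeader tuple, Snapshot snapshot)
    {
        TransactionId xmin = HeapTupleHeaderGetXmin(tuple);

        if (TransactionIdIsCurrentTransactionId(xmin))
            return false;       /* our own insertion; not "the future" */

        /*
         * If xmin is still running according to the snapshot, the
         * tuple postdates the snapshot from our point of view.
         */
        return XidInMVCCSnapshot(xmin, snapshot);
    }

A REPEATABLE READ or SERIALIZABLE upserter would check this before
deciding the lock succeeded, and throw a serialization failure if it
returned true.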

Incidentally, I tried to rebase recently, and saw some shift/reduce
conflicts due to 1b4f7f93b4693858cb983af3cd557f6097dab67b, "Allow
empty target list in SELECT". The fix for that is not immediately
obvious.

So I think we should proceed with the non-conclusive-check-first
approach (if only on pragmatic grounds), but even now I'm not really
sure. I think there might be unprincipled deadlocking should
ExecInsertIndexTuples() fail to be completely consistent about its
ordering of insertion -- the use of dirty snapshots (including as part
of conventional !UNIQUE_CHECK_PARTIAL unique index enforcement) plays
a part in this risk. Roughly speaking, heap_delete() doesn't render
the tuple immediately invisible to another xact's dirty snapshot
[2], and I think that could have unpleasant interactions, even if it
is also beneficial in some ways. Our old, dead tuples from previous
attempts stick around, and function as "value locks" to everyone else,
since for example _bt_check_unique() cares about visibility having
merely been affected, which is grounds for blocking. More
counter-intuitive still, we go ahead with "value locking" (i.e. btree
UNIQUE_CHECK_PARTIAL tuple insertion originating from the main
speculative ExecInsertIndexTuples() call) even though we already know
that we will delete the corresponding heap row (which, as noted, still
satisfies HeapTupleSatisfiesDirty() and so is value-lock-like).
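To spell out the mechanism with a sketch (illustrative only -- this
mirrors what _bt_check_unique()-style code does, it is not code from
either patch): a dirty snapshot reports any in-progress inserter or
deleter out-of-band in its xmin/xmax fields, and the uniqueness check
blocks on whichever xact is reported, so our dead-but-uncommitted
tuple keeps blocking everyone else until our xact ends:

    SnapshotData SnapshotDirty;
    TransactionId xwait;

    InitDirtySnapshot(SnapshotDirty);

    if (heap_fetch(heapRel, &SnapshotDirty, &htup, &buffer, true, NULL))
    {
        /* dirty-visible; an inserter or deleter may still be in flight */
        xwait = TransactionIdIsValid(SnapshotDirty.xmin) ?
            SnapshotDirty.xmin : SnapshotDirty.xmax;

        if (TransactionIdIsValid(xwait))
        {
            ReleaseBuffer(buffer);
            XactLockTableWait(xwait);   /* block until that xact ends */
            /* ...then restart the uniqueness check from scratch */
        }
    }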

Empirically, retrying because ExecInsertIndexTuples() returns some
recheckIndexes occurs infrequently, so maybe that makes all of this
okay. Or maybe it happens infrequently *because* we don't give up on
insertion when it looks like the current iteration is futile. Maybe
just inserting into every unique index, and then blocking on an xid
within ExecCheckIndexConstraints(), works out fairly and performs
reasonably in all common cases. It's pretty damn subtle, though, and
for those reasons I worry about the worst case performance, and about
basic correctness. The fact that deferred unique indexes also use
UNIQUE_CHECK_PARTIAL is cold comfort -- that path only ever has to
throw an error on conflict, and only once. We haven't "earned the
right" to lock *all* values in all unique indexes, but we kind of do
so anyway in the event of an "insertion conflicted after pre-check".
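For reference, the overall dance being described is roughly this (a
sketch of the control flow only, not actual patch code; names like
resultRelationDesc and the conflict handling are illustrative):

    for (;;)
    {
        List       *recheckIndexes;
        HeapUpdateFailureData hufd;

        /* speculatively insert the heap tuple... */
        heap_insert(resultRelationDesc, tuple, estate->es_output_cid,
                    0, NULL);

        /* ...and then the index tuples, with deferred unique checks */
        recheckIndexes = ExecInsertIndexTuples(slot, &(tuple->t_self),
                                               estate);

        if (recheckIndexes == NIL)
            break;              /* no possible conflicts; success */

        /*
         * Possible conflict.  Delete our own speculatively inserted
         * heap tuple and retry.  Until our xact ends, the dead tuple
         * remains visible to dirty snapshots, and so still acts as a
         * "value lock" against everyone else.
         */
        heap_delete(resultRelationDesc, &(tuple->t_self),
                    estate->es_output_cid, InvalidSnapshot, true, &hufd);

        /* wait on / lock the conflicting row here, then loop around */
    }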

Another concern that bears reiterating: I think that making the
lock-for-update case work for exclusion constraints is a lot of
additional complexity for a very small return.

Do you think it's worth optimizing ExecInsertIndexTuples() to avoid
futile non-unique/exclusion constrained index tuple insertion?

[1] http://www.postgresql.org/message-id/CAM3SWZS2--GOvUmYA2ks_aNyfesb0_H6T95_k8+wyx7Pi=CQvw@mail.gmail.com

[2] https://github.com/postgres/postgres/blob/94b899b829657332bda856ac3f06153d09077bd1/src/backend/utils/time/tqual.c#L798

--
Peter Geoghegan
