From: | Peter Geoghegan <pg(at)heroku(dot)com> |
---|---|
To: | Robert Haas <robertmhaas(at)gmail(dot)com> |
Cc: | Andres Freund <andres(at)2ndquadrant(dot)com>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: Making joins involving ctid work for the benefit of UPSERT |
Date: | 2014-07-28 20:41:53 |
Message-ID: | CAM3SWZQ7d3nLECwLSvDHtZKsGZ04X0Zet9TzpN-g-T2gFZuwkA@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On Mon, Jul 28, 2014 at 10:43 AM, Peter Geoghegan <pg(at)heroku(dot)com> wrote:
> On a mostly unrelated note, I'll remind you of the reason that I felt
> it was best to lock indexes. It wasn't so much about avoiding bloat as
> it was about avoiding deadlocks. When I highlighted the issue,
> Heikki's prototype, which did insert optimistically rather than
> locking, was then made to go and physically "super delete" the
> upsert-insert conflicting heap tuple (inserted optimistically before
> its index tuples), before going to lock a row, in order to avoid
> unprincipled deadlocking. In contrast, my design just used a callback
> that released page level heavyweight locks before going to lock a row.
> Heikki's prototype involved making it so that *even someone else's
> dirty snapshot* didn't see our dead speculatively-inserted heap tuple.
>
> Anyway, making all that happen is fairly invasive to a bunch of places
> that are just as critical as the nbtree code.
I think I should be more concrete about why this is more complicated
than it first appears. Andres said at pgCon that he would still go
with a design where "promise tuples" are inserted into indexes ahead
of any heap tuple (which differs from Heikki's prototype, where heap
tuple insertion occurs first). Accounting for deadlocking issues could
be particularly problematic with that design, since we must kill
tuples from each index in turn before row locking. In any case the
need to efficiently *release* locks must be weighed carefully. It
isn't obvious, but we must release locks if there is a conflict.
After Heikki acknowledged the problem [1], he produced a revision
addressing it. The details of the workaround and a patch were posted
[2]. I think it's fair to say that this area is a lot messier than it
first appears. If anyone wants to propose an alternative design, they
are of course quite welcome to, but I ask that the very real
difficulties with those designs be acknowledged. AFAICT, only Heikki
has independently acknowledged these issue on list.
In case it isn't obvious, let me be clear about what I care about
here. I feel it's important to get something that is easy to reason
about - you write a DML statement, and within certain fairly
reasonable parameters Postgres does the rest. I think we should not
accept something that may even occasionally through deadlock errors,
or unique violations, or RC-level serialization failures through no
fault of the user. That would be inferior to the plpgql looping
pattern we promote that does the right thing and avoids all of this.
It would be awful to have to tell users who hit problems like this
that they should just stop doing so much upserting, or use the old
pattern.
[1] http://www.postgresql.org/message-id/52B4AAF0.5090806@vmware.com
[2] http://www.postgresql.org/message-id/52D00D2D.6030307@vmware.com
--
Peter Geoghegan
From | Date | Subject | |
---|---|---|---|
Next Message | Thomas Munro | 2014-07-28 21:48:19 | Re: SKIP LOCKED DATA (work in progress) |
Previous Message | Robert Haas | 2014-07-28 20:36:52 | Re: [w32] test_shm_mq test suite permanently burns connections slots |