Re: INSERT ... ON CONFLICT {UPDATE | IGNORE} 2.0

From: Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>
To: Peter Geoghegan <pg(at)heroku(dot)com>
Cc: Bruce Momjian <bruce(at)momjian(dot)us>, Andres Freund <andres(at)2ndquadrant(dot)com>, Stephen Frost <sfrost(at)snowman(dot)net>, Jeff Janes <jeff(dot)janes(at)gmail(dot)com>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>, Thom Brown <thom(at)linux(dot)com>
Subject: Re: INSERT ... ON CONFLICT {UPDATE | IGNORE} 2.0
Date: 2015-02-16 08:00:24
Message-ID: 54E1A398.6080304@vmware.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 02/16/2015 02:44 AM, Peter Geoghegan wrote:
> On Sat, Feb 14, 2015 at 2:06 PM, Heikki Linnakangas
> <hlinnakangas(at)vmware(dot)com> wrote:
>> I did not solve the potential for livelocks (discussed at
>> http://www.postgresql.org/message-id/CAM3SWZTfTt_fehet3tU3YKCpCYPYnNaUqUZ3Q+NAASnH-60teA@mail.gmail.com)
>> The patch always super-deletes the already-inserted tuple on conflict, and
>> then waits for the other inserter. That would be nice to fix, using one of
>> the ideas from that thread, or something else.
>
> How about we don't super-delete at all with exclusion constraints? I'd
> be willing to accept unprincipled deadlocks for exclusion constraints,
> because they already exist today for UPSERT/NOSERT type use cases, and
> with idiomatic usage seem far less likely for the IGNORE variant
> (especially with exclusion constraints).

So INSERT ON CONFLICT IGNORE on a table with an exclusion constraint
might fail. I don't like that. The point of having the command in the
first place is to deal with concurrency issues. If it sometimes doesn't
work, it's broken.

> I can see people using ON
> CONFLICT UPDATE where they'd almost or actually be better off using a
> plain UPDATE - that's quite a different situation. I find livelocks to
> be a *very* scary prospect, and I don't think the remediations that
> were mentioned are going to fly. It's just too messy, and too much of
> a niche use case. TBH I am skeptical of the idea that we can fix
> exclusion constraints properly in this way at all, at least not
> without the exponential backoff thing, which just seems horrible.

The idea of comparing the TIDs of the tuples as a tie-breaker seems most
promising to me. If the conflicting tuple's TID is smaller than the
inserted tuple's, super-delete and wait. Otherwise, wait without
super-deletion. That's really simple. Do you see a problem with that?

> Although you understood what I was on about when I first talked about
> unprincipled deadlocks, I think that acceptance of that idea came
> much later from other people, because it's damn complicated.

BTW, it would good to explain somewhere in comments or a README the term
"unprincipled deadlock". It's been a useful concept, and hard to grasp.
If you defined it once, with examples and everything, then you could
just have "See .../README" in other places that need to refer it.

> It's not worth it to add
> some weird "Dining philosophers" exponential backoff thing to make
> sure that the IGNORE variant when used with exclusion constraints can
> never deadlock in an unprincipled way, but if it is worth it then it
> seems unreasonable to suppose that this patch needs to solve that
> pre-existing problem. No?

The point of solving the existing problem is that it allows us to split
the patch, for easier review.

>> Are we happy with acquiring the SpeculativeInsertLock heavy-weight lock for
>> every insertion? That seems bad for performance reasons. Also, are we happy
>> with adding the new fields to the proc array? Another alternative that was
>> discussed was storing the speculative insertion token on the heap tuple
>> itself. (See
>> http://www.postgresql.org/message-id/52D00D2D.6030307@vmware.com)
>
> Whatever works, really. I can't say that the performance implications
> of acquiring that hwlock are at the forefront of my mind. I never
> found that to be a big problem on an 8 core box, relative to vanilla
> INSERTs, FWIW - lock contention is not normal, and may be where any
> heavweight lock costs would really be encountered.

Oh, cool. I guess the fast-path in lmgr.c kicks ass, then :-).

> I don't see how storing the promise token in heap tuples buys us not
> having to involve heavyweight locking of tokens. (I think you may have
> made a thinko in suggesting otherwise)

It wouldn't get rid of heavyweight locks, but it would allow getting rid
of the procarray changes. The inserter could generate a token, then
acquire the hw-lock on the token, and lastly insert the heap tuple with
the correct token.

>> Are we happy with the way super-deletion works? Currently, the xmin field is
>> set to invalid to mark the tuple as super-deleted. That required checks in
>> HeapTupleSatisfies* functions. One alternative would be to set xmax equal to
>> xmin, and teach the code currently calls XactLockTableWait() to check if
>> xmax=xmin, and not consider the tuple as conflicting.
>
> That couldn't work without further HeapTupleSatisfiesDirty() logic.

Why not?

> Besides, why should one transaction be entitled to insert a
> conflicting value tuple just because a still running transaction
> deleted it (having also inserted the tuple itself)? Didn't one
> prototype version of value locking #2 have that as a bug [1]? In fact,
> originally, didn't the "xmin set to invalid" thing come immediately
> from realizing that that wasn't workable?

Ah, right. So the problem was that some code might assume that if you
insert a row, delete it in the same transaction, and then insert the
same value again, the 2nd insertion will always succeed because the
previous insertion effectively acquired a value-lock on the key.

Ok, so we can't unconditionally always ignore tuples with xmin==xmax. We
would need an infomask bit to indicate super-deletion, and only ignore
it if the bit is set.

I'm starting to think that we should bite the bullet and consume an
infomask bit for this. The infomask bits are a scarce resource, but we
should use them when it makes sense. It would be good for forensic
purposes too, to leave a trace that a super-deletion happened.

> I too think the tqual.c changes are ugly. I can't see a way around
> using a token lock, though - I would only consider (and only consider)
> refining the interface by which a waiter becomes aware that it must
> wait on the outcome of the inserting xact's speculative
> insertion/value lock (and in particular, that is should not wait on
> xact outcome). We clearly need something to wait on that isn't an
> XID...heavyweight locking of a token value is the obvious way of doing
> that.

Yeah.

> Jim Nasby said something about setting the HEAP_XMIN_INVALID hint bit.
> Maybe he is right...if that can be made to be reliable (always
> WAL-logged), it could be marginally better than setting xmin to
> invalidTransactionId.

I'm not a big fan of that. The xmin-invalid bit is currently always just
a hint.

> But only marginally; it seems like your
> discomfort is basically inherent to playing these new games with
> visibility, which is inherent to this design.

No, I get that we're playing games with visibility. I want to find the
best way to implement those games.

> As I said, I am having a
> hard time seeing a way to do anything more than polish what we have
> here. That's probably fine, though...it hasn't proven to be too
> problematic (exclusion constraints aside).
>
> Not having to change HeapTupleSatisfiesMVCC() and so on (to check if
> xmin = InvalidTransactionId) is not obviously a useful goal here,
> since ISTM that any alternative would have to *logically* do the same
> thing. If I'm off the mark about your thinking here, please correct
> me....are you just worried about extra cycles in the
> HeapTupleSatisfies* routines?

Extra cycles yes, but even more importantly, code readability and
maintainability.

- Heikki

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Andres Freund 2015-02-16 08:41:59 Re: Allow "snapshot too old" error, to prevent bloat
Previous Message Michael Paquier 2015-02-16 07:52:07 Re: pg_dump gets attributes from tables in extensions