From: | Peter Geoghegan <pg(at)heroku(dot)com> |
---|---|
To: | Heikki Linnakangas <hlinnakangas(at)vmware(dot)com> |
Cc: | Pg Hackers <pgsql-hackers(at)postgresql(dot)org>, Robert Haas <robertmhaas(at)gmail(dot)com>, Andres Freund <andres(at)2ndquadrant(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
Subject: | Re: INSERT...ON DUPLICATE KEY LOCK FOR UPDATE |
Date: | 2014-01-11 03:59:01 |
Message-ID: | CAM3SWZRi8hhQ59aaiBgey5PNt1MEZ6jUnbv7wqd=Y-k-B1Vpvg@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On Fri, Jan 10, 2014 at 7:09 AM, Heikki Linnakangas
<hlinnakangas(at)vmware(dot)com> wrote:
> Nah, that's nothing. I have a patch in the January commitfest that was
> already posted for the previous commitfest. It received zero review back
> then, and still has no reviewer signed up, let alone anyone actually
> reviewing it. And arguably it's a bug fix!
>
> http://www.postgresql.org/message-id/5285071B.1040100@vmware.com
>
> Wink wink, if you're looking for patches to review... ;-)
Yeah, I did intend to take a closer look at that one (I've looked at
it but have nothing to share yet). I've been a little busy with other
things. That patch is more of the kind where it's a matter of
determining if what you've done is exactly correct (no one would
disagree with the substance of what you propose), whereas there is
uncertainty about whether I've gotten the semantics right and so on.
But that's no excuse. :-)
>> The alternative exclusion* patch still deadlocks in an unprincipled
>> fashion
> Here's an updated patch. Hope it works now... This is based on an older
> version, and doesn't include any fixes from your latest
> btreelock_insert_on_dup.v7.2014_01_07.patch. Please check the common parts,
> and copy over any relevant changes.
Okay, attached is a revision with some of my fixes for other parts of
the code merged (in particular, for the grammer, ecpg and some aspects
of row locking and visibility).
Some quick observations on your patch - maybe this is obvious, and you
have work-arounds in mind, but this is just my first impression:
* You're always passing HEAP_INSERT_SPECULATIVE to heap_insert, and
therefore in the event of any sort of insertion always getting an
exclusive lock on the procArray. I guess the fact that this always
happens, and not just when upserting is an oversight (I know you just
wanted to throw together a POC), but even still that seems kind of
questionable. Everyone knows that contention during GetSnapshotData is
still a big problem for us. Taking an exclusive ProcArrayLock perhaps
as frequently as more than once per slot seems like a really bad idea,
even if it's limited to speculative inserters.
* It seems questionable that you don't at least have a shared
ProcArrayLock when you set the token value in
SetSpeculativeInsertionToken() (as you know, MyPgXact->xmin can be set
with such a shared lock, so doing something similar here might be
okay, but it's far from obvious that no lock is okay). Now, I guess
you can point towards MinimumActiveBackends() as a kind of precedent,
but that seems substantially less scary than what you've done, because
that's just reading if a field is zero or non-zero. Obviously the
implications of actually doing this are that things get even worse for
performance. And even a shared lock might not be good enough - I'd
have to think about it some more to give a firmer opinion.
> The fix for the deadlocking issue consists of a few parts. First, there's a
> new heavy-weight lock type, a speculative insertion lock, which works
> somewhat similarly to XactLockTableWait(), but is only held for the duration
> of a single speculative insertion. When a backend is about to begin a
> speculative insertion, it first acquires the speculative insertion lock.
> When it's done with the insertion, meaning it has either cancelled it by
> killing the already-inserted tuple or decided that it's going to go ahead
> with it, the lock is released.
I'm afraid I must reiterate my earlier objection to the general thrust
of what you're doing, which is that it is evidently unnecessary to
spread knowledge of value locking around the system, as opposed to
localizing knowledge of it to one module, in this case nbtinsert.c.
While it's true that the idea of the AM abstraction is already perhaps
a little strained, this seems like a big expansion on that problem.
Why should this approach make sense for every conceivable AM that
supports some notion of a constraint? Heavyweight exclusive locks on
indexes (at the page level typically), persisting across complex
operations are not a new thing for Postgres.
> HeapTupleSatisfiesDirty() does the proc array check, and returns the token
> in the snapshot, alongside snapshot->xmin. The caller can then use that
> information in place of XactLockTableWait().
That seems like a modularity violation too. The HeapTupleSatisfiesMVCC
changes reflect a genuine need to make every MVCC snapshot care about
the special visibility exception, whereas only one or two
HeapTupleSatisfiesDirty() callers will ever care about speculative
insertion. Even if you're unmoved by the modularity/aesthetic argument
(which is not to suppose that you actually are), the fact that you're
calling SpeculativeInsertionIsInProgress(), which acquires a shared
ProcArrayLock much of the time from within HeapTupleSatisfiesDirty(),
may have seriously regressed foreign key enforcement, for example.
You're going to need something like a new type of snapshot, basically,
and we probably already have too many of those. But then, can you
really get away with a new snapshot type so most existing places are
unaffected? Why shouldn't ri_triggers.c have to care? Offhand I think
it must care, unless you go give it some special knowledge too. So you
either risk regressing performance badly, or play whack-a-mole with
all of the other dirty snapshot callsites. That seems like a textbook
example of a really bad modularity violation. The consequences may
spread beyond that, further than we can easily predict.
>> More importantly, I have to question
>> whether we should continue to pursue that alternative approach, giving
>> what we now know about its performance characteristics.
>
> Yes.
Okay. Unfortunately, I must press you on this point: what is it that
you don't like about what I've done? What aspects of my approach
concern you, and specifically what aspects of my approach do you hope
to avoid? If you take a close look at how value locking is performed,
it actually is very similar to the existing mechanism,
counterintuitive though that is. It's a modest expansion on how things
already work. I contend that my approach is, apart from everything
else, the more conservative of the two.
>> It could be
>> improved, but not by terribly much, particularly for the case where
>> there is plenty of update contention, which was shown in [1] to be
>> approximately 2-3 times slower than extended page locking (*and* it's
>> already looking for would-be duplicates*first*). I'm trying to be as
>>
>> fair as possible, and yet the difference is huge.
>
> *shrug*. I'm not too concerned about performance during contention. But
> let's see how this fixed version performs. Could you repeat the tests you
> did with this?
Why would you not be too concerned about the performance with
contention? It's a very important aspect. But even if you don't, if
you look at the transaction throughput with only a single client in
the update-heavy benchmark [1] (with one client there is a fair mix of
inserts and updates), my approach still comes out far ahead.
Transaction throughput is almost 100% higher, with the *difference*
exceeding 150% at 8 clients but never reaching too much higher. I
think that the problem isn't so much with contention between clients
as much as with contention between inserts and updates, which affects
everyone to approximately the same degree. And the average max latency
across runs for one client is 130.447 ms, as opposed to 0.705 ms with
my patch - that's less than 1%. Whatever way you cut it, the
performance of my approach is far superior. Although we should
certainly investigate the impact of your most recent revision, and I
intend to, how can you not consider those differences to be extremely
significant? And honestly, I have a really hard time imagining what
you've done here had anything other than a strong negative effect on
performance, in which case the difference in performance will be wider
still.
> Any guesses what the bottleneck is? At a quick glance at a profile of a
> pgbench run with this patch, I didn't see anything out of ordinary, so I'm
> guessing it's lock contention somewhere.
See my previous remarks on "index scan with an identity crisis" [2].
I'm pretty sure it was mostly down to the way you optimistically
proceed with duplicate index tuple insertion (which you'll do even
with the btree code knowing that it almost certainly won't work out,
something that makes less sense than with deferred unique constraints,
where the user has specifically indicated that things may well work
out by making the constraint deferred in the first place). I also
think that the way my approach very effectively avoids wasted effort
(including but not limited to unnecessarily heavy locking) plays an
important role in making it perform so well. This turns out to be much
more important than the downside of having value locks be slightly
coarser than strictly necessary. When I tried to quantify that
overhead with a highly unsympathetic benchmark, the difference was
barely measurable [2][3].
[1] http://postgres-benchmarks.s3-website-us-east-1.amazonaws.com/upsert-cmp-2/
[3] http://postgres-benchmarks.s3-website-us-east-1.amazonaws.com/vs-vanilla-insert/
--
Peter Geoghegan
Attachment | Content-Type | Size |
---|---|---|
speculative-insertions-revisions.2014_01_10.patch.gz | application/x-gzip | 27.4 KB |
From | Date | Subject | |
---|---|---|---|
Next Message | Peter Eisentraut | 2014-01-11 04:47:07 | Re: [PATCH] Store Extension Options |
Previous Message | Peter Eisentraut | 2014-01-11 03:46:41 | Re: [PATCH] Add transforms feature |