Re: INSERT...ON DUPLICATE KEY LOCK FOR UPDATE

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Peter Geoghegan <pg(at)heroku(dot)com>
Cc: Bruce Momjian <bruce(at)momjian(dot)us>, Andres Freund <andres(at)2ndquadrant(dot)com>, Greg Stark <stark(at)mit(dot)edu>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: INSERT...ON DUPLICATE KEY LOCK FOR UPDATE
Date: 2013-09-26 19:33:34
Message-ID: CA+TgmoY94DRtdnyGfC0bg3Dy+nzqavCTOYX57Sf6YY0=szxKZA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Thu, Sep 26, 2013 at 3:07 PM, Peter Geoghegan <pg(at)heroku(dot)com> wrote:
> When you consider that the feature will frequently be used with the
> assumption that updating is a much more likely outcome, it becomes
> clear that we need to be careful about this sort of interplay.

I think one thing that's pretty clear at this point is that almost any
version of this feature could be optimized for either the insert case
or the update case. For example, my proposal could be modified to
search for a conflicting tuple first, potentially wasting an index
probes (or multiple index probes, if you want to search for potential
conflicts in multiple indexes) if we're inserting, but winning heavily
in the update case. As written, it's optimized for the insert case.

In fact, I don't know how to know which of these things we should
optimize for. I wrote part of the code for an EDB proprietary feature
that can do insert-or-update loads about 6 months ago[1], and we
optimized it for updates. That was not, however, a matter of
principal; it just turned out to be easier to implement that way. In
fact, I would have assumed that the insert-mostly case was more
likely, but I think the real answer is that some environments will be
insert-mostly and some will be update-mostly and some will be a mix.

If we really want to squeeze out every last drop of possible
performance, we might need two modes: one that assumes we'll mostly
insert, and another that assumes we'll mostly update. That seems a
frustrating amount of detail to have to expose to the user; an
implementation that was efficient in both cases would be very
desirable, but I do not have a good idea how to get there.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

[1] In case you're wondering, attempting to use that feature to upsert
an invisible tuple will result in the load failing with a unique index
violation.

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Robert Haas 2013-09-26 19:46:23 Re: Minmax indexes
Previous Message Robert Haas 2013-09-26 19:15:16 Re: INSERT...ON DUPLICATE KEY LOCK FOR UPDATE