From: | Robert Haas <robertmhaas(at)gmail(dot)com> |
---|---|
To: | Andres Freund <andres(at)2ndquadrant(dot)com> |
Cc: | Peter Geoghegan <pg(at)heroku(dot)com>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: INSERT...ON DUPLICATE KEY LOCK FOR UPDATE - visibility semantics |
Date: | 2013-09-24 14:51:14 |
Message-ID: | CA+TgmoZScKe1d7cHJ5WxRi7nXTO0u4RXyYZ61qd0-0BtyL9asw@mail.gmail.com |
Lists: | pgsql-hackers |
On Tue, Sep 24, 2013 at 5:14 AM, Andres Freund <andres(at)2ndquadrant(dot)com> wrote:
> Various messages are discussing semantics around visibility. I by now
> have a hard time keeping track. So let's keep the discussion of the
> desired semantics to this thread.
>
> There have been some remarks about serialization failures in read
> committed transactions. I agree, those shouldn't occur. But I don't
> actually think they are so much of a problem if we follow the path set
> by existing uses of the EPQ logic. The scenario described seems to be an
> UPSERT conflicting with a row it cannot see in the original snapshot of
> the query.
> In that case I think we just have to follow the example laid by
> ExecUpdate, ExecDelete and heap_lock_tuple. Use the EPQ machinery (or an
> alternative approach with similar enough semantics) to get a new
> snapshot and follow the ctid chain. When we've found the end of the
> chain we try to update that tuple.
> That surely isn't free of surprising semantics, but it would follow existing
> semantics. Which everybody writing concurrent applications in read
> committed should (but doesn't) know. Adding a different set of semantics
> seems like a bad idea.
> Robert seems to have been the primary sceptic around this, what scenario
> are you actually concerned about?
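[For readers unfamiliar with the EPQ behavior Andres refers to, here is a sketch of what READ COMMITTED already does for a plain UPDATE today; the table and values are hypothetical:]

    -- Hypothetical table; shows the existing READ COMMITTED behavior
    -- provided by the EPQ (EvalPlanQual) machinery for UPDATE.
    CREATE TABLE counters (id int PRIMARY KEY, val int);
    INSERT INTO counters VALUES (1, 10);

    -- Session A:
    BEGIN;
    UPDATE counters SET val = 20 WHERE id = 1;   -- takes the row lock

    -- Session B (READ COMMITTED):
    UPDATE counters SET val = val + 1 WHERE id = 1;
    -- blocks on A's lock; once A commits, EPQ follows the ctid chain
    -- to the newest row version and re-checks the qual, so session B
    -- updates the committed version (val becomes 21, not 11), even
    -- though that version was not visible in B's original snapshot.

[The proposal is to apply these same semantics when an UPSERT conflicts with a row it cannot see.]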
I'm not skeptical about offering it as an option; in fact, I just
suggested basically the same thing on the other thread, before reading
this. Nonetheless it IS an MVCC violation; the chances that someone
will be able to demonstrate serialization anomalies that can't occur
today with this new facility seem very high to me. I feel it's
perfectly fine to respond to that by saying: yep, we know that's
possible, if it's a concern in your environment then don't use this
feature. But it should be clearly documented.
I do think that it will be easier to get this to work if we define the
operation as REPLACE, bundling all of the magic inside a single SQL
command. If the user issues an INSERT first and then must
try an UPDATE afterwards if the INSERT doesn't actually insert, then
you're going to have problems if the UPDATE can't see the tuple with
which the INSERT conflicted, and you're going to need some kind of a
loop in case the UPDATE itself fails. Even if we can work out all the
details, a single command that does insert-or-update seems like it
will be easier to use and more efficient. You might also want to
insert multiple tuples using INSERT ... VALUES (...), (...), (...);
figuring out which ones were inserted and which ones must now be
updated seems like a chore better avoided.
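[The retry loop described above looks roughly like the following; the table name and columns are hypothetical, and this mirrors the well-known upsert example from the PostgreSQL documentation:]

    -- Sketch of the INSERT-then-UPDATE retry loop a client must write
    -- today, absent a single-command upsert. Table "db" is hypothetical.
    CREATE FUNCTION merge_db(k int, v text) RETURNS void AS
    $$
    BEGIN
        LOOP
            -- Try the UPDATE first; if a matching row exists, done.
            UPDATE db SET data = v WHERE key = k;
            IF found THEN
                RETURN;
            END IF;
            -- No row yet: try to INSERT. A concurrent INSERT of the
            -- same key raises unique_violation, in which case we loop
            -- and retry the UPDATE against the now-committed row.
            BEGIN
                INSERT INTO db (key, data) VALUES (k, v);
                RETURN;
            EXCEPTION WHEN unique_violation THEN
                -- Do nothing, and loop to try the UPDATE again.
            END;
        END LOOP;
    END;
    $$ LANGUAGE plpgsql;

[Note the UPDATE inside the loop still cannot see a conflicting tuple newer than its snapshot without EPQ-style chain following, which is exactly the visibility question under discussion.]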
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company