Re: WIP fix proposal for bug #6123

From: "Kevin Grittner" <Kevin(dot)Grittner(at)wicourts(dot)gov>
To: "Robert Haas" <robertmhaas(at)gmail(dot)com>, "Florian Pflug" <fgp(at)phlo(dot)org>
Cc: <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: WIP fix proposal for bug #6123
Date: 2011-08-08 15:02:26
Message-ID: 4E3FB432020000250003FC27@gw.wicourts.gov
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
> Florian Pflug <fgp(at)phlo(dot)org> wrote:
>> Going down that road opens the door to a *lot* of subtle semantic
>> differences between currently equivalent queries. For example,
>>
>> UPDATE T SET f=f, a=1
>>
>> would behave differently then
>>
>> UPDATE T SET a=1
>>
>> because in the first case the new row would depend on the old
>> row's value of "f", while in the second case it doesn't.
>
> True. But I'm not really bothered by that. If the user gets an
> error message that says:
>
> ERROR: updated column "f" has already been modified by a BEFORE
> trigger
>
> ...the user will realize the error of their ways.
>
>>> There's no way to get the same result as if you'd done either
>>> one of them first, because >>> they are inextricably
>>> intertwined.
>>>
>>> In practice, my hand-wavy reference to "reconciling the updates"
>>> is a problem because of the way the trigger interface works. It
>>> feels pretty impossible to decide that we're going to do the
>>> update, but with some other random values we dredged up from
>>> some other update replacing some of the ones the user explicitly
>>> handed to us. But if the trigger could return an indication of
>>> which column values it wished to override, then it seems to me
>>> that we could piece together a reasonable set of semantics.
>>> It's not exactly clear how to make that work, though.
>>
>> I dunno, that all still feels awfully complex. As you said
>> yourself, this case is quite similar to a serialization anomaly.
>> Taking that correspondence further, that reconciliation of
>> updates is pretty much what the EPQ machinery does in READ
>> COMMITTED mode. Now, we ourselves have warned users in the past
>> to *not* use READ COMMITTED mode if they do complex updates
>> (e.g., use UPDATE ... FROM ...), because the behaviour of that
>> reconciliation machinery in the present of concurrent updates is
>> extremely hard to predict. I thus don't believe that it's a good
>> idea to introduce similarly complex behaviour in other parts of
>> the system - and particularly not if you cannot disable it by
>> switching to another isolation level.
>>
>> Simply throwing an error, on the other hand, makes the behaviour
>> simple to understand and explain.
>
> True. I'm coming around to liking that behavior better than I did
> on first hearing, but I'm still not crazy about it, because as an
> app developer I would really like to have at least the
> unproblematic cases actually work. Throwing an error at least
> makes it clear that you've done something which isn't supported,
> and that's probably an improvement over the current, somewhat-
> baffling behavior. However, it's not even 25% as nice as having it
> actually work as intended. That's why, even if we can't make all
> the cases work sanely, I'd be a lot more enthusiastic about it if
> we could find a way to make at least some of them work sanely.
> The mind-bending cases are unlikely to be the ones people do on
> purpose.

So, we have three options on the table here:

(1) We (the Wisconsin Courts) are using a very simple fix to work
around the issue so we can move forward with conversion to
PostgreSQL triggers. A DELETE is allowed to complete if the BEFORE
trigger doesn't return NULL, even if the row was updated while the
trigger was executing. An UPDATE fails if the BEFORE trigger
doesn't return NULL and any other update is made to the row while
the trigger was executing.

(2) You (Robert) have proposed (as I understand it) modifying that
approach to allow some UPDATE cases to work, where there are not
conflicting updates to any one column within the row.

(3) Florian has proposed banning all updates to a row which is being
processed by a BEFORE UPDATE or BEFORE DELETE trigger. As I
understand it, this would be similar to the approach taken by
Oracle, although less strict.

I can live with any of these solutions. How do we move forward to
consensus so that a patch can be written? Does anyone else have a
preference for one of these approaches versus another?

-Kevin

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Jesper Krogh 2011-08-08 15:11:12 Re: mosbench revisited
Previous Message Andres Freund 2011-08-08 14:49:01 Re: fstat vs. lseek