Re: WIP fix proposal for bug #6123

From: Florian Pflug <fgp(at)phlo(dot)org>
To: "Kevin Grittner" <Kevin(dot)Grittner(at)wicourts(dot)gov>
Cc: "Robert Haas" <robertmhaas(at)gmail(dot)com>, <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: WIP fix proposal for bug #6123
Date: 2011-08-09 14:34:00
Message-ID: 07701AE8-87DD-4C36-8B3B-B7724D96F6EC@phlo.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Aug9, 2011, at 15:41 , Kevin Grittner wrote:
> Florian Pflug <fgp(at)phlo(dot)org> wrote:
>
>> To summarize, here's what I currently believe would be a sensible
>> approach:
>>
>> After every BEFORE trigger invocation, if the trigger returned
>> non-NULL, check if latest row version is still the same as when
>> the trigger started. If not, complain.
>
> That certainly has the advantages of being a small, safe change and
> of being easy to explain. It would certainly prevent the
> astonishing sorts of behaviors which now occur and which can leave
> people with database contents they thought they had guards against.
>
> The down side is that something this strict does make it hard to
> achieve certain behaviors which could be desirable for maintaining
> redundant data for performance. In my bottom-up delete scenario,
> there would either need to be somewhere to note that a row was being
> deleted so that the delete of the children could skip maintaining
> it, or the cascade would need to be implemented in the AFTER
> triggers, and validations would need to accept orphans which could
> be created. Either approach can be made to work, but from the
> application development side, it's not as clean or easy.

Another option would be to re-issue the DELETE from the BEFORE DELETE
trigger, and then return NULL. It'll cause the BEFORE DELETE trigger
to be invoked recursively, but presumably the second invocation could
easily detect that all required pre-delete actions have already taken
place and exit early (and return OLD). In pseudo-code, something like

BEFORE DELETE ON <parent table>:
DELETE FROM <child table> WHERE parent_id = OLD.id;
IF FOUND THEN
-- Removing children might have modified our row,
-- so returning non-NULL is not an option
DELETE FROM <table> WHERE id = OLD.id;
RETURN NULL;
ELSE
-- No children removed, so our row should be unmodified
RETURN OLD;
END IF;

Or, more generally, you could check for modifications of the row
to be deleted. I'm not sure if the ctid column is currectly available
for OLD and NEW, but if it is, you could do

BEFORE DELETE ON <table>:
<arbitrarily complex child removal logic>
IF EXISTS(SELECT 1 FROM <table> WHERE ctid = OLD.ctid)
-- Original row wasn't unmodified, actual delete should succeed
RETURN OLD;
ELSE
-- Original was modified, actual delete would fail
DELETE FROM <table> WHERE pk = OLD.pk;
RETURN NULL;
END IF;

This only requires that the "arbitrarily complex child removal logic"
is smart enough not to update the parent table if no children were
actually removed.

> The suggested approach for UPDATE with my original approach to
> DELETE would make me happier, but I'm still not clear on Robert's
> use cases and how that would affect him. Can you clarify why you
> feel UPDATE and DELETE should both do this?

I'm concerned that proceeding with a DELETE even though the row
has been modified has an equal chance of cause subtle data-integrity
violations as the current behaviour. We might remove a row, even though
the row, at the time of deletion, does *not* match the DELETE's WHERE
condition. That seems like a violation of very basic guarantees to me.

A real-world use-case where that might happen could consist of a table
with a "reference_count" column, together with a BEFORE DELETE trigger
which (indirectly via other triggers) sometimes causes a reference_count
to be changed from 0 to >0. Simply inserting a new row instead of re-using
the old one may not be an option because of UNIQUE constraints. The
statement

DELETE FROM reference_counted_table WHERE reference_count = 0

might then delete rows even though they *are* referenced, if the
references where added from the BEFORE DELETE trigger. The result would
probably dangling references, i.e. a data-integrity violation.

One might attempt to solve that by re-checking the DELETE's WHERE
condition before carrying out the DELETE. To avoid ending up where
we started, i.e. with BEFORE triggers firing but no DELETE taking place,
we'd have to error out if the re-check fails instead of silently
ignoring the DELETE. At which point whether or not a DELETE of a single
record succeeds or fails depends on the WHERE condition used to *find*
that record. That, I feel, isn't a road we want to take...

best regards,
Florian Pflug

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Kevin Grittner 2011-08-09 14:53:37 Re: WIP fix proposal for bug #6123
Previous Message Andrew Dunstan 2011-08-09 13:49:30 Re: Problem with sources.