From: | Florian Pflug <fgp(at)phlo(dot)org> |
---|---|
To: | Kevin Grittner <kgrittn(at)ymail(dot)com> |
Cc: | Nick Barnes <nickbarnes01(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com> |
Subject: | Re: Question about RI checks |
Date: | 2014-10-24 11:53:47 |
Message-ID: | AC19E024-578A-4E1A-87DC-BF068094785A@phlo.org |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On Oct23, 2014, at 17:45 , Kevin Grittner <kgrittn(at)ymail(dot)com> wrote:
> Every way I look at it, inside a REPEATABLE READ or SERIALIZABLE
> transaction a check for child rows when validating a parent DELETE
> should consider both rows which exist according to the transaction
> snapshot and according to a "current" snapshot.
I've pondered this further, and unfortunately it seems that this
isn't sufficient to guarantee true serializability :-(
Verifying that both snapshots contain exactly the same rows does not
prevent a child row from being inserted and immediately deleted again,
not if both actions happen *after* the parent-updating transaction
took its snapshot, but *before* it takes the crosscheck snapshot.
Let parent(id) and child(id, parent_id) again be two tables with a
FK constraint between them, let <child> be initially empty, and let
<parent> contain a single row (1).
Assume PD is a transaction which deletes all rows from <parent>,
CI a transaction which inserts the row (1, 1) into <child>, and
CD a transaction which deletes that row again.
Even with the extended cross-checking you propose, we'd still allow
the following concurrent schedule
1. PD takes snapshot
2. CI starts and completes
3. CD starts and completes
4. PD deletes from <parent> without complaining, since there were
no conflicting rows at time (1), and none at time (4).
So far, all is well. But add two more tables, called <ci_before_pd>
and <pd_before_cd>, both initially containing one row. Let CI scan
<ci_before_pd>, let PD delete from <ci_before_pd> and scan <pd_before_cd>,
and let CD delete from <pd_before_cd>. In the concurrent schedule from
above, CI will see the row in <ci_before_pd>, and PD will delete it, and
PD will see the row in <pd_before_cd> that CD deletes. Note that SSI *will*
allow that schedule to occur without raising a serialization error
The only serial schedule which yields the same results for the various
queries pertaining <ci_before_pd> and <pd_before_cd> is
CI -> PD -> CD,
i.e. PD has to run *between* CI and CD. But in that serial schedule,
PD *should* see a FK key violation, since CI has inserted a child which
CD hasn't yet deleted.
There is thus *no* serial schedule which yields the same results as the
concurrent schedule above for the queries pertaining <parent> and <child>,
*and* for the queries pertaining <ci_before_pd> and <pd_before_cd>, i.e
the concurrent schedule is *not* serializable. Yet even the extended cross-
check won't detect this.
Attached is an isolationtester spec file which implements this example,
and the corresponding out-file which shows that SSI permits the concurrent
schedule. Since SSI doesn't concern itself with RI enforcement queries,
it would also permit that schedule if we extended the cross-check, I think.
(I used REL9_4_STABLE as of today to try this, commit
1cf54b00ba2100083390223a8244430643c1ec07)
best regards,
Florian Pflug
Attachment | Content-Type | Size |
---|---|---|
fk-consistency2.spec | application/octet-stream | 2.0 KB |
fk-consistency2.out | application/octet-stream | 1.2 KB |
From | Date | Subject | |
---|---|---|---|
Next Message | Pavel Stehule | 2014-10-24 11:58:07 | Re: Function array_agg(array) |
Previous Message | Michael Paquier | 2014-10-24 11:46:19 | Re: Function array_agg(array) |