Quick Links

Re: invalid tid errors in latest 7.3.4 stable.

From:	Stephan Szabo <sszabo(at)megazone(dot)bigpanda(dot)com>
To:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc:	archeron(at)wavefire(dot)com, Chris Kratz <chris(dot)kratz(at)vistashare(dot)com>, pgsql-hackers(at)postgresql(dot)org, Jan Wieck <JanWieck(at)Yahoo(dot)com>
Subject:	Re: invalid tid errors in latest 7.3.4 stable.
Date:	2003-09-30 22:45:45
Message-ID:	20030930134452.E21417@megazone.bigpanda.com
Views:	Whole Thread \| Raw Message \| Download mbox \| Resend email
Thread:
Lists:	pgsql-hackers

On Tue, 30 Sep 2003, Tom Lane wrote:

> Stephan Szabo <sszabo(at)megazone(dot)bigpanda(dot)com> writes:
> > On Fri, 26 Sep 2003, Tom Lane wrote:
> >> Hmm, that is a good point. It would be cleaner to throw a "can't
> >> serialize" failure than have the RI triggers run under a different
> >> snapshot. I am not sure if we can implement that behavior easily,
> >> though. Can you think of a way to detect whether there's an RI conflict
> >> against a later-started transaction?
>
> > Not a complete one yet. :(
>
> After working through the cases it doesn't seem like this would be that
> hard to do. Here's my proposal:
>
> 1. When running in a READ COMMITTED transaction, the RI triggers simply
> need to do SetQuerySnapshot before acting. This will update the snap so
> that they can see all rows they need to see; they don't need to fool
> with using SnapshotNow. Any triggers or rules fired indirectly as a
> result of RI actions can use this same snapshot. No inconsistencies.

This seems right. The locks should keep anyone else's hands off if there
were any possibility of conflicts.

> 2. When running in a SERIALIZABLE transaction, our objective is to use
> the transaction's serializable snapshot for triggers and rules fired by
> RI actions. Therefore the RI actions should throw "can't serialize"
> errors if they would need to affect rows that are not visible under this
> snapshot. To see what's involved, break down the problem into cases:
>
> * FK->PK actions (that is, checking a matching PK row exists; we're
> trying to do a SELECT FOR UPDATE on the PK table). Four cases:
> 1. PK row existed before our xact, still exists: no problem
> 2. PK row existed before our xact, updated: no problem really,
> but would be okay to throw serialization error
> 3. PK row existed before our xact, deleted: must throw error,
> but does it matter if we say "serialization error"
> or "key not found"?
> 4. no PK row existed before our xact: really should throw a
> "key not found" error even if a row has since been
> inserted.
>
> 7.3 and CVS tip in fact throw can't-serialize errors in both cases 2 and
> 3. They get case 4 wrong: they will allow an FK to be inserted even
> though the PK row is too young to be visible to our transaction.
>
> I claim that all we need to do here is run the SELECT FOR UPDATE under
> the transaction's serializable snapshot. That will give the same
> behavior as the existing code for cases 2 and 3, and will give what
> I claim is the right behavior (throw error) for case 4.

An error for 2-4 seems reasonable here. I wonder how easily the behavior
for 2 and 3 will carry over when we switch away from FOR UPDATE on the
constraint checks, but we can worry about that then.

> * PK->FK actions (cascading from a PK update or delete):
> 1. FK row existed before our xact, still exists: no problem,
> just perform the action
> 2. FK row existed before our xact, updated: must throw serialization
> error since we cannot update row in SERIALIZABLE mode
> 3. FK row existed before our xact, deleted: ideally should throw
> serialization error, but would probably be okay to ignore row too
> 4. FK row inserted during our xact: must throw error, but could
> call it either serialization or key-exists.
>
> 7.3 throws can't-serialize errors in cases 2 and 3, and a key-exists
> error in case 4.
>
> We cannot implement case 4 correctly if we do the scan under our
> serialization snapshot (we'll fail to see the row at all). Seems what
> we must do is scan using a current snapshot (from a SetQuerySnapshot
> taken when the RI trigger is entered), and then throw serialization
> error if we find any rows to update or delete that would not be visible
> under the transaction's serializable snapshot. If the rows are visible,
> it's okay to operate on them with the serializable snapshot as
> QuerySnapshot for rules or triggers. Note that this implementation
> means that case 3 will not throw errors, because such rows will be
> ignored by the scan. I think this is an okay tradeoff for getting the
> other cases right.

It's probably better than the current situation anyway. I think that it'd
be revisitable if someone wanted to bring it up later.

> The Executor API extension that we need to implement this behavior
> is to pass in a second "crosscheck" snapshot against which tuples to be
> updated/deleted are checked; if they don't pass the crosscheck then
> throw serialization error. This only requires minimal mods in
> execMain.c. The snapshot-creation code in tqual.c will also need an
> extra entry point so that the RI triggers can create a fresh snapshot
> even in SERIALIZABLE mode (SetQuerySnapshot won't do this).

I'd have figured that this would be a bigger deal to implement cleanly,
but if it isn't, then it sounds good. I'm a little worried because
there's been talk of beta and this going in now for fear of breaking
something worse than it already is, but if you think that it's safe
enough.

In response to

Re: invalid tid errors in latest 7.3.4 stable. at 2003-09-30 20:10:04 from Tom Lane

Responses

Re: invalid tid errors in latest 7.3.4 stable. at 2003-09-30 23:14:25 from Tom Lane

Browse pgsql-hackers by date

	From	Date	Subject
Next Message	Tom Lane	2003-09-30 23:14:25	Re: invalid tid errors in latest 7.3.4 stable.
Previous Message	Tom Lane	2003-09-30 22:26:52	Re: [SQL] plpgsql doesn't coerce boolean expressions to