Re: BUG #14808: V10-beta4, backend abort

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Thomas Munro <thomas(dot)munro(at)enterprisedb(dot)com>
Cc: Michael Paquier <michael(dot)paquier(at)gmail(dot)com>, Andrew Gierth <rhodiumtoad(at)postgresql(dot)org>, PostgreSQL mailing lists <pgsql-bugs(at)postgresql(dot)org>, Kevin Grittner <kgrittn(at)gmail(dot)com>
Subject: Re: BUG #14808: V10-beta4, backend abort
Date: 2017-09-14 21:45:55
Message-ID: 630.1505425555@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-bugs

Attached is a draft patch for this.

Thomas Munro <thomas(dot)munro(at)enterprisedb(dot)com> writes:
> 1. Merging the transition tables when there are multiple wCTEs
> referencing the same table. Here's one idea: Rename
> MakeTransitionCaptureState() to GetTransitionCaptureState() and use a
> hash table keyed by table OID in
> afterTriggers.transition_capture_states[afterTriggers.query_depth] to
> find the TCS for the given TriggerDesc or create it if not found, so
> that all wCTEs find the same TransitionCaptureState object. The all
> existing callers continue to do what they're doing now, but they'll be
> sharing TCSs appropriately with other things in the plan. Note that
> TransitionCaptureState already holds tuplestores for each operation
> (INSERT, UPDATE, DELETE) so the OID of the table alone is a suitable
> key for the hash table (assuming we are ignoring the column-list part
> of the spec as you suggested).

It seems unsafe to merge the TCS objects themselves, because the callers
assume that they can munge the tcs_map and tcs_original_insert_tuple
fields freely without regard for any other callers. So as I have it,
we still have a TCS for each caller, but the TCSes point at tuplestores
that can be shared across multiple callers for the same event type.
The tuplestores themselves are managed by the AfterTrigger data
structures. Also, because the TCS structs are just throwaway per-caller
data, it's uncool to reference them in the trigger event lists.
So I replaced ats_transition_capture with two pointers to the actual
tuplestores. That bloats AfterTriggerSharedData a bit but I think it's
okay; we don't expect a lot of those structs in a normal query.

I chose to make the persistent state (AfterTriggersTableData) independent
for each operation type. We could have done that differently perhaps, but
it seemed more complicated and less likely to match the spec's semantics.

The INSERT ON CONFLICT UPDATE mess is handled by creating two separate
TCSes with two different underlying AfterTriggersTableData structs.
The insertion tuplestore sees only the inserted tuples, the update
tuplestores see only the updated-pre-existing tuples. That adds a little
code to nodeModifyTable but it seems conceptually much cleaner.

> 2. Hiding the fact that we implement fk CASCADE using another level
> of queries. Perhaps we could arrange for
> afterTriggers.transition_capture_states[afterTriggers.query_depth] to
> point to the same hash table as query_depth - 1, so that the effects
> of statements at this implementation-internal level appear to the user
> as part of the the level below?

That already happens, because query_depth doesn't increment for an FK
enforcement query --- we never call AfterTriggerBegin/EndQuery for it.

> 3. Merging the invocation after statement firing so that if you
> updated the same table directly and also via a wCTE and also
> indirectly via fk ON DELETE/UPDATE trigger then you still only get one
> invocation of the after statement trigger. Not sure exactly how...

What I did here was to use the AfterTriggersTableData structs to hold
a flag saying we'd already queued statement triggers for this rel and
cmdType. There's probably more than one way to do that, but this seemed
convenient.

One thing I don't like too much about that is that it means there are
cases where the single statement trigger firing would occur before some
AFTER ROW trigger firings. Not sure if we promise anything about the
ordering in the docs. It looks quite expensive/complicated to try to
make it always happen afterwards, though, and it might well be totally
impossible if triggers cause more table updates to occur.

Because MakeTransitionCaptureState now depends on the trigger query
level being active, I had to relocate the AfterTriggerBeginQuery calls
to occur before it.

In passing, I refactored the AfterTriggers data structures a bit so
that we don't need to do as many palloc calls to manage them. Instead
of several independent arrays there's now one array of structs.

regards, tom lane

Attachment Content-Type Size
merge-transition-tables-1.patch text/x-diff 71.1 KB

In response to

Responses

Browse pgsql-bugs by date

  From Date Subject
Next Message Thomas Munro 2017-09-15 00:12:35 Re: BUG #14808: V10-beta4, backend abort
Previous Message Peter Eisentraut 2017-09-14 21:15:53 Re: BUG #14813: pg_get_serial_sequence does not return seqence name for IDENTITY columns