Transition relations: correlating OLD TABLE and NEW TABLE

From: Brent Kerby <blkerby(at)gmail(dot)com>
To: pgsql-hackers(at)postgresql(dot)org
Subject: Transition relations: correlating OLD TABLE and NEW TABLE
Date: 2018-07-07 03:03:59
Message-ID: CAH8WVsjQ104O3-CNSeFo48ku-qWh0vFCzcvr+K93bax30voJFA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

In a situation where we're using transition relations to capture changes
after UPDATE statements, i.e., using a trigger of the form

CREATE TRIGGER trigger_name AFTER UPDATE ON table_name
REFERENCING OLD TABLE AS old_table NEW TABLE AS new_table
FOR EACH STATEMENT EXECUTE PROCEDURE procedure();

there seems to be a challenge in how to join `old_table` and `new_table` so
that the old and new version of each row can be matched up with each other.
Of course if the table has a primary key, then this can be used, but I'm
wondering how to handle this in the general case where a primary key might
not exist.

According to this blog (
http://databasedoings.blogspot.com/2017/07/cool-stuff-in-postgresql-10-transition.html)
it is possible to use ROW_NUMBER() OVER () to create a key to join the old
and new tables, but this depends on an implementation detail (or at least,
isn't documented?) that the rows will be returned in the same order for the
two tables. Is it correct that this will work under the existing
implementation?

If there's not a clean way of matching up the old and new versions, could
the transition relation mechanism be extended in order to make this
possible? Here's a couple ideas:

1) A special system column could be added to the two transition relations,
OLD TABLE, and NEW TABLE, providing a common value that could be used to
join corresponding rows; it could be a sequential value (like what
ROW_NUMBER() would generate), or it could be some other unique identifier
for the row that is convenient for implementation. But there's some
awkwardness in the fact that this special column name could clash with the
columns in the table (unless an existing reserved name is used). Also,
exposing a unique row identifier might restrict potential future
implementations.

2) Maybe a cleaner way would be to add a third kind of transition table,
say, a "CHANGE TABLE", which could be used as an alternative to "OLD TABLE"
and "NEW TABLE". A change table could contain just two columns, say
'old_row' and 'new_row', each of which have the appropriate record type. In
this way, the old table and new table are essentially "pre-joined" in the
transition table.

Would this be workable? Or is there some other way of achieving this?

- Brent Kerby

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message David G. Johnston 2018-07-07 03:36:34 Re: Transition relations: correlating OLD TABLE and NEW TABLE
Previous Message Amit Kapila 2018-07-07 02:15:41 Re: Explain buffers wrong counter with parallel plans