Deferred constraint trigger semantics

From: Bryn Llewellyn <bryn(at)yugabyte(dot)com>
To: pgsql-general list <pgsql-general(at)lists(dot)postgresql(dot)org>
Cc: laurenz(dot)albe(at)cybertec(dot)at
Subject: Deferred constraint trigger semantics
Date: 2022-05-11 00:46:59
Message-ID: 5C29B392-6215-4791-8DD8-7B8977A637B7@yugabyte.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

SUMMARY
=======

I looked at the sections "CREATE TRIGGER" and "Chapter 39. Triggers" in the Current PG doc. But I failed to find any information about the semantics of the deferred constraint trigger or about the use cases that motivated this feature. Nor could I find any code examples. Internet Search turned up this 2019 post by Laurenz Albe's—but nothing else at all.

https://www.cybertec-postgresql.com/en/triggers-to-enforce-constraints/

(This is why I CC'd you, Laurenz.)

Laurenz described a use case that's best met by a SQL Assertion. But I don't know of a single available relational database system that supports this feature—though the SQL Standard defines it. (There's been talk of bringing SQL Assertion support in Oracle Database for several years. See https://community.oracle.com/tech/apps-infra/discussion/4390732/sql-assertions-declarative-multi-row-constraints <https://community.oracle.com/tech/apps-infra/discussion/4390732/sql-assertions-declarative-multi-row-constraints>. But I don't know if any progress has been made.)

Laurenz's example treats the semantics as if the deferred constraint trigger fires exactly once, the moment before commit, so that it can check that the business rule is met and, if so, perform the commit before a concurrent session is allowed to commit a change that would, in such a race condition and together with the present session's commit, cause rule violation.

Is this a supported use—even though, as I show below, the trigger fires many times and produces the same result each time that it fires? And if so, why cannot the definition, optionally, be spelled "after... statement" and fire that way?

If the use that Laurenz's example shows is supported, then I could have such a trigger on every table that participates in a requirement that's tersely expressed using a SQL Assertion (as documentation). I'd simply use the same trigger function for the deferred constraint trigger on each of those tables. And I'd accept the fact that it unnecessarily (and expensively) fired more times than it needed to. Because I want to test only the final state, and intermediate states might violate the rule that I want to enforce, I need to add logic to populate a log (presumably a temporary table) with an ordinary trigger, every time any involved table changes, so that I could check the log in the constraint trigger's function. When it has N rows, then the function should skip the check except on its Nth invocation.

There's an implicit question lurking here: might it be possible to define a new ON COMMIT trigger firing point, legal only for a deferred constraint trigger? It should fire just once when a multi-statement transaction is committed if one or many of the tables, whose triggers share the same function, see changes.

DETAIL
======

I read this in the Current CREATE TRIGGER section: Constraint triggers must be AFTER ROW triggers on plain tables.

And sure enough, this attempt:

create constraint trigger trg
after insert on t
for each statement
execute function trg_fn();

causes a syntax error. This implies that the concept is to program a constraint that (like a declarative constraint on a table’s column as part of the table's definition) has just single-row scope. Yet I discovered, serendipitously (and before reading Laurenz's 's post), that an “after row” constraint trigger does allow querying the table it's on (and other tables) in the trigger function's body. I used this technique in the example that I used to start this thread:

"A transaction cannot be ended inside a block with exception handlers"
https://www.postgresql.org/message-id/40EB34E2-F9CA-4D29-B0BE-6DF3D4D0FBA3@yugabyte.com <https://www.postgresql.org/message-id/40EB34E2-F9CA-4D29-B0BE-6DF3D4D0FBA3@yugabyte.com>

Nobody said that I was doing anything unsupported.

I also used it in "The complete testcase" at the end of my final response in that thread:
https://www.postgresql.org/message-id/549EB7DB-214E-420C-BEEF-324AB124DE99@yugabyte.com <https://www.postgresql.org/message-id/549EB7DB-214E-420C-BEEF-324AB124DE99@yugabyte.com>

Is my entire concept (and Laurenz's too) fundamentally flawed? Specifically, is querying a trigger's base table in a "for each row" trigger fundamentally unsound and not supported? (In Oracle Database, it causes the notorious "mutating table" runtime error.)

This little test shows what actually happens:

create table t1(k serial primary key, v int not null);
create table t2(k serial primary key, v int not null);

create function trg_fn()
returns trigger
language plpgsql
as $body$
declare
n int not null := 0;
begin
n := (select count(*) from t1) + (select count(*) from t2);
raise info 'trg fired. new.v = %, n = %', new.v, n;
return new;
end;
$body$;

create constraint trigger trg
after insert on t1
for each row
execute function trg_fn();

create constraint trigger trg
after insert on t2
for each row
execute function trg_fn();

set default_transaction_isolation = 'read committed';
do $body$
begin
insert into t1(v)
values (10), (20), (30), (40), (50);

insert into t2(v)
values (60), (70), (80);
end;
$body$;

Here's the "raise info" output:

INFO: trg fired. new.v = 10, n = 5
INFO: trg fired. new.v = 20, n = 5
INFO: trg fired. new.v = 30, n = 5
INFO: trg fired. new.v = 40, n = 5
INFO: trg fired. new.v = 50, n = 5
INFO: trg fired. new.v = 60, n = 8
INFO: trg fired. new.v = 70, n = 8
INFO: trg fired. new.v = 80, n = 8

It shows the expected "new" value each time it fires. And yet the query reflects the table content on statement completion. This seems to be very strange. But it is, in fact, what I need to implement my testcase (below). I want the trigger to cause an error only when the final content is wrong.

As long as I can rely on my approach (i.e. trust the serendipitously useful behavior to be supported), then I can use it to effect to implement, for example, an entity level constraint. I can ignore the inefficiency brought by running the same test on the same data many times.)

(Separate tests show that if I foolishly refer to "old" or "new" values in the trigger's function, then these seem to be "snapshotted" at statement execution time. These values are no use when only the final state is interesting.)

I've copied a self-contained testcase, below, that implements the rule that, in the table of staff, there must be exactly one or two rows with the job "Admin". It works fine in the test that I've done. I used two concurrent sessions that contrived the race condition that I discussed elsewhere.

It shows that, as long as I rely on the commit-time error to enforce the rule, the second session to commit gets the required error when both sessions changed the admin count from one to two by updating different staff members to make their job "Admin".

And it shows, too, that if I test using "set constraints all immediate" before either session commits, and then do the commit in each, neither gets the error and I end up with three staff with the "Admin" job. This is simply the race condition that I expected.

Note: I do appreciate that if I use the serializable isolation level together with an ordinary trigger, then I'll achieve my race-condition-proof rule arguably more straightforwardly—esp. as I could noe use a proper "after statement" firing point.

ONE OR TWO ADMINS USE CASE
==========================

Notice "pg_sleep(10)" right before the end of the trigger function. This gives enough time to start the test in each of two concurrent windows up to, and including "set constraints all immediate" (when this is uncommented) before each session commits.

Try the test first with "set constraints all immediate" uncommented.

The "red" session, because it starts fires, shows that the, so far, rule is met when it finishes. But when the "blue" session finishes, it shows that the final state now violates the rule.

Now try the test first with "set constraints all immediate" commented out.

The "red" session finishes first and reports a good state. Then the blue session finishes with this error:

ERROR: There must be exactly one or two Admins
CONTEXT: PL/pgSQL function t_constraint_fn() line 6 at RAISE

and its attempts to change the data are rolled back—so that the rule still holds.

-- do-setup.sql
---------------
drop table if exists staff;
create table staff(
name text primary key,
job text not null
constraint staff_job_chk check (job in (
'Manager', 'Admin', 'Sales', 'Marketing', 'Developer')));

drop function if exists t_constraint_fn();
create function t_constraint_fn()
returns trigger
language plpgsql
as $body$
declare
bad constant boolean not null :=
(select count(*) from staff where job = 'Admin') not in (1, 2);
begin
if bad then raise exception using
errcode = 'raise_exception',
message = 'There must be exactly one or two Admins';
end if;
return null;
end;
$body$;

create constraint trigger t_constraint
after insert or update on staff
initially deferred
for each row
execute function t_constraint_fn();

insert into staff(name, job) values
('Bill', 'Marketing'),
('Fred', 'Sales'),
('John', 'Admin'),
('Mary', 'Manager'),
('Maude', 'Developer'),
('Susan', 'Sales');

drop function if exists test_outcome(text) cascade;
create function test_outcome(which in text)
returns text
language plpgsql
as $body$
declare
var text not null := '';
begin
case which
when 'red' then
update staff set job = 'Admin' where name = 'Bill';
when 'blue' then
update staff set job = 'Admin' where name = 'Fred';
end case;

-- Uncomment to see the race-condition problem.
-- set constraints all immediate;

perform pg_sleep(10);
return 'Sucess.';
exception when raise_exception then
return '"One or two admins" rule violated.';
end;
$body$;

-- qry.sql
----------
select
case
when job = 'Admin' then ' >>>'
else ' '
end
as is_staff,
name,
job
from staff
order by name;

-- demo.sql
-- In the "red" session.
\i do-setup.sql
set default_transaction_isolation = 'read committed';
select test_outcome('red');
\i qry.sql

--------------------------------------------------------------------------------
/*
-- In the concurrent "blue" session.
-- Do this ASAP after running "demo.sql" in the "red" session.
set default_transaction_isolation = 'read committed';
select test_outcome('blue');
\i qry.sql
*/;

Responses

Browse pgsql-general by date

  From Date Subject
Next Message David G. Johnston 2022-05-11 01:19:47 Re: Deferred constraint trigger semantics
Previous Message Peter J. Holzer 2022-05-10 22:43:54 Re: PLPGSQL - extra column existence in trigger