From: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
To: Corey Huinker <corey(dot)huinker(at)gmail(dot)com>
Cc: PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Referential Integrity Checks with Statement-level Triggers
Date: 2018-12-17 16:11:07
Message-ID: CAFj8pRCQe9=tX5GmC-S_Dmms9tdU4OWGD2xaHZFm-yaMS1zmoQ@mail.gmail.com
Lists: pgsql-hackers
On Mon, Dec 17, 2018 at 15:32, Corey Huinker <corey(dot)huinker(at)gmail(dot)com>
wrote:
> Back when Pg added statement-level triggers, I was interested in the
> potential promise of moving referential integrity checks to statement-level
> triggers.
>
> The initial conversation included Kevin Grittner's POC script (in SQL),
> which showed a potential 98% reduction in time spent doing RI checks. The
> original thread is here:
>
>
> https://www.postgresql.org/message-id/CACjxUsM4s9%3DCUmPU4YFOYiD5f%3D2ULVDBjuFSo20Twe7KbUe8Mw%40mail.gmail.com
>
> I dug around in the code, and was rather surprised at how close we already
> are to implementing this. The function RI_Initial_Check() already does a
> left-join query via SPI to look for any invalid data, so if we could just
> replace the near table with the transition table for inserted rows, we'd be
> home free. The function SPI_register_trigger_data() makes the transition
> tables visible to SPI, so I started to wonder why this hadn't been done
> already.
>
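> As a very rough illustration (table, column, and function names below are
> made up, and this is not the actual query RI_Initial_Check() builds), a
> statement-level FK check over the inserted-rows transition table could boil
> down to a single anti-join:
>
>     CREATE FUNCTION fk_check_stmt() RETURNS trigger AS $$
>     BEGIN
>       IF EXISTS (SELECT 1
>                    FROM new_rows n                 -- transition table of inserted rows
>                    LEFT JOIN pk_table p ON p.id = n.fk_id
>                   WHERE n.fk_id IS NOT NULL
>                     AND p.id IS NULL)              -- no matching referenced row
>       THEN
>         RAISE EXCEPTION 'insert on fk_table violates foreign key constraint';
>       END IF;
>       RETURN NULL;
>     END;
>     $$ LANGUAGE plpgsql;
>
>     CREATE TRIGGER fk_check_after_insert
>       AFTER INSERT ON fk_table
>       REFERENCING NEW TABLE AS new_rows
>       FOR EACH STATEMENT
>       EXECUTE PROCEDURE fk_check_stmt();
>
> The real RI triggers in ri_triggers.c would build the equivalent query via
> SPI rather than plpgsql; the sketch is only meant to show the shape of the
> check.
>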
> I approached Kevin and Thomas Munro seeking feedback on my approach. I
> also brought it up in a session at the PgConf.ASIA un-conference, and later
> with Michael Paquier at that same conference, and the coalesced feedback
> was this:
>
> - the overhead of registering the transition tables probably makes it
> unprofitable for single row inserts
> - the per-row overhead of the existing row-level triggers is itself
> significant, so maybe the transition tables aren't worse
> - there has been talk of replacing transition tables with an in-memory
> data structure that would be closer to "free" from a startup perspective
> and might even coalesce the transition tables of multiple statements in the
> same transaction
> - because no declarative code changes are required, it's trivial to switch
> from row-level to statement-level triggering via pg_upgrade
> - assuming that transition tables are an overhead that only pays off when
> more than N rows have been updated, does it make sense to enforce RI with
> something that isn't actually a trigger?
> - there was also some mention that parallel query uses a queue mechanism
> that might be leveraged to do row-level triggers for updates of N rows or
> fewer, and statement-level for more than N
>
> That's what I have so far. I'm going to be working on a POC patch so that
> I can benchmark a pure-statement-level solution, which if nothing else will
> let us know the approximate value of N.
>
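> A first benchmark could be as simple as timing the same bulk load into a
> table with a normal FK and into one that relies on a statement-level check
> like the sketch above (sizes and names below are arbitrary):
>
>     CREATE TABLE pk (id int PRIMARY KEY);
>     INSERT INTO pk SELECT g FROM generate_series(1, 1000000) g;
>
>     CREATE TABLE fk_row  (pk_id int REFERENCES pk (id)); -- row-level RI triggers
>     CREATE TABLE fk_stmt (pk_id int);  -- attach the statement-level trigger here
>
>     \timing on
>     INSERT INTO fk_row  SELECT (g % 1000000) + 1 FROM generate_series(1, 100000) g;
>     INSERT INTO fk_stmt SELECT (g % 1000000) + 1 FROM generate_series(1, 100000) g;
>
> Repeating the last two inserts at different row counts should give a first
> approximation of N.
>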
> All suggestions are appreciated.
>
It is great. I have thought about it a little, just theoretically.
A ROW trigger calls the RI check too often, and a statement trigger too
seldom. I think the ideal design would call the RI check every 10K rows or
so: it can be unfriendly if somebody runs a very long import and it fails
only at the end. I don't think there should be any performance difference
once the RI check is called per 1000 or more rows.
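
Just to illustrate the "fails at the end" problem with what exists today, a
deferrable FK (assuming the constraint was created DEFERRABLE) already
behaves much like a pure statement- or transaction-level check:

    BEGIN;
    SET CONSTRAINTS ALL DEFERRED;
    COPY fk_table FROM '/tmp/big_import.dat';  -- bad reference early in the file
    COMMIT;  -- the FK violation is reported only here, after the whole load

A check every N rows would report the problem much earlier.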
Regards
Pavel