From: | Antonin Houska <ah(at)cybertec(dot)at> |
---|---|
To: | pgsql-hackers(at)postgresql(dot)org |
Subject: | More efficient RI checks - take 2 |
Date: | 2020-04-08 16:38:01 |
Message-ID: | 1813.1586363881@antos |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
After having reviewed [1] more than a year ago (the problem I found was that
the transient table is not available for deferred constraints), I've tried to
implement the same in an alternative way. The RI triggers still work as row
level triggers, but if multiple events of the same kind appear in the queue,
they are all passed to the trigger function at once. Thus the check query does
not have to be executed that frequently.
Some performance comparisons are below. (Besides the execution time, please
note the difference in the number of trigger function executions.) In general,
the checks are significantly faster if there are many rows to process, and a
bit slower when we only need to check a single row. However I'm not sure about
the accuracy if only a single row is measured (if a single row check is
performed several times, the execution time appears to fluctuate).
Comments are welcome.
Setup
=====
CREATE TABLE p(i int primary key);
INSERT INTO p SELECT x FROM generate_series(1, 16384) g(x);
CREATE TABLE f(i int REFERENCES p);
Insert many rows into the FK table
==================================
master:
EXPLAIN ANALYZE INSERT INTO f SELECT i FROM generate_series(1, 16384) g(i);
QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------------
Insert on f (cost=0.00..163.84 rows=16384 width=4) (actual time=32.741..32.741 rows=0 loops=1)
-> Function Scan on generate_series g (cost=0.00..163.84 rows=16384 width=4) (actual time=2.403..4.802 rows=16384 loops=1)
Planning Time: 0.050 ms
Trigger for constraint f_i_fkey: time=448.986 calls=16384
Execution Time: 485.444 ms
(5 rows)
patched:
EXPLAIN ANALYZE INSERT INTO f SELECT i FROM generate_series(1, 16384) g(i);
QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------------
Insert on f (cost=0.00..163.84 rows=16384 width=4) (actual time=34.053..34.053 rows=0 loops=1)
-> Function Scan on generate_series g (cost=0.00..163.84 rows=16384 width=4) (actual time=2.223..4.448 rows=16384 loops=1)
Planning Time: 0.047 ms
Trigger for constraint f_i_fkey: time=105.164 calls=8
Execution Time: 141.201 ms
Insert a single row into the FK table
=====================================
master:
EXPLAIN ANALYZE INSERT INTO f VALUES (1);
QUERY PLAN
------------------------------------------------------------------------------------------
Insert on f (cost=0.00..0.01 rows=1 width=4) (actual time=0.060..0.060 rows=0 loops=1)
-> Result (cost=0.00..0.01 rows=1 width=4) (actual time=0.002..0.002 rows=1 loops=1)
Planning Time: 0.026 ms
Trigger for constraint f_i_fkey: time=0.435 calls=1
Execution Time: 0.517 ms
(5 rows)
patched:
EXPLAIN ANALYZE INSERT INTO f VALUES (1);
QUERY PLAN
------------------------------------------------------------------------------------------
Insert on f (cost=0.00..0.01 rows=1 width=4) (actual time=0.066..0.066 rows=0 loops=1)
-> Result (cost=0.00..0.01 rows=1 width=4) (actual time=0.002..0.002 rows=1 loops=1)
Planning Time: 0.025 ms
Trigger for constraint f_i_fkey: time=0.578 calls=1
Execution Time: 0.670 ms
Check if FK row exists during deletion from the PK
==================================================
master:
DELETE FROM p WHERE i=16384;
ERROR: update or delete on table "p" violates foreign key constraint "f_i_fkey" on table "f"
DETAIL: Key (i)=(16384) is still referenced from table "f".
Time: 3.381 ms
patched:
DELETE FROM p WHERE i=16384;
ERROR: update or delete on table "p" violates foreign key constraint "f_i_fkey" on table "f"
DETAIL: Key (i)=(16384) is still referenced from table "f".
Time: 5.561 ms
Cascaded DELETE --- many PK rows
================================
DROP TABLE f;
CREATE TABLE f(i int REFERENCES p ON UPDATE CASCADE ON DELETE CASCADE);
INSERT INTO f SELECT i FROM generate_series(1, 16384) g(i);
master:
EXPLAIN ANALYZE DELETE FROM p;
QUERY PLAN
-----------------------------------------------------------------------------------------------------------
Delete on p (cost=0.00..236.84 rows=16384 width=6) (actual time=38.334..38.334 rows=0 loops=1)
-> Seq Scan on p (cost=0.00..236.84 rows=16384 width=6) (actual time=0.019..3.925 rows=16384 loops=1)
Planning Time: 0.049 ms
Trigger for constraint f_i_fkey: time=31348.756 calls=16384
Execution Time: 31390.784 ms
patched:
EXPLAIN ANALYZE DELETE FROM p;
QUERY PLAN
-----------------------------------------------------------------------------------------------------------
Delete on p (cost=0.00..236.84 rows=16384 width=6) (actual time=33.360..33.360 rows=0 loops=1)
-> Seq Scan on p (cost=0.00..236.84 rows=16384 width=6) (actual time=0.012..3.183 rows=16384 loops=1)
Planning Time: 0.094 ms
Trigger for constraint f_i_fkey: time=9.580 calls=8
Execution Time: 43.941 ms
Cascaded DELETE --- a single PK row
===================================
INSERT INTO p SELECT x FROM generate_series(1, 16384) g(x);
INSERT INTO f SELECT i FROM generate_series(1, 16384) g(i);
master:
DELETE FROM p WHERE i=16384;
DELETE 1
Time: 5.754 ms
patched:
DELETE FROM p WHERE i=16384;
DELETE 1
Time: 8.098 ms
Cascaded UPDATE - many rows
===========================
master:
EXPLAIN ANALYZE UPDATE p SET i = i + 16384;
QUERY PLAN
------------------------------------------------------------------------------------------------------------
Update on p (cost=0.00..277.80 rows=16384 width=10) (actual time=166.954..166.954 rows=0 loops=1)
-> Seq Scan on p (cost=0.00..277.80 rows=16384 width=10) (actual time=0.013..7.780 rows=16384 loops=1)
Planning Time: 0.177 ms
Trigger for constraint f_i_fkey on p: time=60405.362 calls=16384
Trigger for constraint f_i_fkey on f: time=455.874 calls=16384
Execution Time: 61036.996 ms
patched:
EXPLAIN ANALYZE UPDATE p SET i = i + 16384;
QUERY PLAN
------------------------------------------------------------------------------------------------------------
Update on p (cost=0.00..277.77 rows=16382 width=10) (actual time=159.512..159.512 rows=0 loops=1)
-> Seq Scan on p (cost=0.00..277.77 rows=16382 width=10) (actual time=0.014..7.783 rows=16382 loops=1)
Planning Time: 0.146 ms
Trigger for constraint f_i_fkey on p: time=169.628 calls=9
Trigger for constraint f_i_fkey on f: time=124.079 calls=2
Execution Time: 456.072 ms
Cascaded UPDATE - a single row
==============================
master:
UPDATE p SET i = i - 16384 WHERE i=32767;
UPDATE 1
Time: 4.858 ms
patched:
UPDATE p SET i = i - 16384 WHERE i=32767;
UPDATE 1
Time: 11.955 ms
[1] https://commitfest.postgresql.org/22/1975/
--
Antonin Houska
Web: https://www.cybertec-postgresql.com
Attachment | Content-Type | Size |
---|---|---|
v01-0001-Check-for-RI-violation-outside-ri_PerformCheck.patch | text/x-diff | 2.4 KB |
v01-0002-Changed-ri_GenerateQual-so-it-generates-the-whole-qu.patch | text/x-diff | 13.8 KB |
v01-0003-Return-early-from-ri_NullCheck-if-possible.patch | text/x-diff | 1011 bytes |
v01-0004-Process-multiple-RI-trigger-events-at-a-time.patch | text/x-diff | 79.8 KB |
From | Date | Subject | |
---|---|---|---|
Next Message | Tom Lane | 2020-04-08 16:39:53 | Re: Commitfest 2020-03 Now in Progress |
Previous Message | David Steele | 2020-04-08 16:36:37 | Re: Commitfest 2020-03 Now in Progress |