Eliminating SPI / SQL from some RI triggers - take 3

From: Amit Langote <amitlangote09(at)gmail(dot)com>
To: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Eliminating SPI / SQL from some RI triggers - take 3
Date: 2024-12-20 04:23:35
Message-ID: CA+HiwqF4C0ws3cO+z5cLkPuvwnAwkSp7sfvgGj3yQ=Li6KNMqA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi,

We discussed $subject at [1] and [2] and I'd like to continue that
work with the hope to commit some part of it for v18.

In short, performing the RI checks for inserts and updates of a
referencing table as direct scans of the PK index results in up to 40%
improvement in their performance, especially when they are done in a
bulk manner as shown in the following example:

create unlogged table pk (a int primary key);
insert into pk select generate_series(1, 10000000);
insert into fk select generate_series(1, 10000000);

On my machine, the last query took 20 seconds with master, whereas 12
seconds with the patches. With master, a significant portion of the
time can be seen spent in ExecutorStart() and ExecutorEnd() on the
plan for the RI query, which adds up as it's done for each row in a
bulk load. Patch avoids that overhead because it calls the index AM
directly.

The patches haven't changed in the basic design since the last update
at [2], though there are few changes:

1. I noticed a few additions to the RI trigger functions the patch
touches, such as those to support temporal foreign keys. I decided to
leave the SQL for temporal queries in place as the plan for those
doesn't look, on a glance, as simple as a simple index scan.

2. As I mentioned in [3], the new way of doing the PK lookup didn't
have a way to recheck the PK tuple after detecting concurrent updates
of the PK, so would cause an error under READ COMMITTED isolation
level. The old way of executing an SQL plan would deal with that
using the EvalPlanQual() mechanism in the executor. In the updated
patch, I've added an equivalent rechecking function that's called in
the same situations as EvalPlanQual() would get called in the old
method.

3. I reordered the patches as Robert suggested at [5]. Mainly because
the patch set includes changes to address a bug where PK lookups could
return incorrect results under the REPEATABLE READ isolation level.
This issue arises because RI lookups on partitioned PK tables
manipulate ActiveSnapshot to pass the snapshot that's used by
find_inheritance_children() to determine the visibility of
detach-pending partitions to these RI lookups. To address this, the
patch set introduces refactoring of the PartitionDesc interface,
included in patch 0001. This refactoring eliminates the need to
manipulate ActiveSnapshot by explicitly passing the correct snapshot
for detach-pending visibility handling. The main patch (0002+0003),
which focuses on improving performance by avoiding SQL queries for RI
checks, builds upon these refactoring changes to pass the snapshot
directly instead of manipulating the ActiveSnapshot. Reordering the
patches this way ensures a logical progression of changes, as Robert
suggested, while avoiding any impression that the bug was introduced
by the ri_triggers.c changes.

However, I need to spend some time addressing Robert's feedback on the
basic design, as outlined at [5]. Specifically, the new PK lookup
function could benefit significantly from caching information rather
than recomputing it for each row. This implies that the PlanCreate
function should create a struct to store reusable information across
PlanExecute calls for different rows being checked.

Beyond implementing these changes, I also need to confirm that the new
plan execution preserves all operations performed by the SQL plan for
the same checks, particularly those affecting user-visible behavior.
I've already verified that permission checks are preserved: revoking
access to the PK table during the checks causes them to fail, as
expected. This behavior is maintained because permission checks are
performed during each execution. The planned changes to separate the
"plan" and "execute" steps should continue to uphold this and other
behaviors that might need to be preserved.

--
Thanks, Amit Langote

[1] Simplifying foreign key/RI checks:
https://www.postgresql.org/message-id/flat/CA%2BHiwqG5e8pk8s7%2B7zhr1Nc_PGyhEdM5f%3DpHkMOdK1RYWXfJsg%40mail.gmail.com

[2] Eliminating SPI from RI triggers - take 2
https://www.postgresql.org/message-id/flat/CA%2BHiwqG5e8pk8s7%2B7zhr1Nc_PGyhEdM5f%3DpHkMOdK1RYWXfJsg%40mail.gmail.com

[3] https://www.postgresql.org/message-id/CA+TgmoaiTNj4DgQy42OT9JmTTP1NWcMV+ke0i=+a7=VgnzqGXw@mail.gmail.com

[4] https://www.postgresql.org/message-id/CA%2BTgmoa1DCQ0MdojD9o6Ppbfj%3DabXxe4FUkwA4O_6qBHwOMVjw%40mail.gmail.com

[5] https://www.postgresql.org/message-id/CA+TgmoaiTNj4DgQy42OT9JmTTP1NWcMV+ke0i=+a7=VgnzqGXw@mail.gmail.com

Attachment Content-Type Size
v1-0003-Avoid-using-an-SQL-query-for-some-RI-checks.patch application/x-patch 49.4 KB
v1-0001-Explicitly-pass-snapshot-necessary-for-omit_detac.patch application/x-patch 17.3 KB
v1-0002-Avoid-using-SPI-in-RI-trigger-functions.patch application/x-patch 32.8 KB

Browse pgsql-hackers by date

  From Date Subject
Next Message Andreas Karlsson 2024-12-20 05:20:38 Speed up ICU case conversion by using ucasemap_utf8To*()
Previous Message jian he 2024-12-20 02:53:06 Re: in BeginCopyTo make materialized view using COPY TO instead of COPY (query).