Re: [BUG] Fix DETACH with FK pointing to a partitioned table fails

From: Jehan-Guillaume de Rorthais <jgdr(at)dalibo(dot)com>
To: Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>
Cc: Tender Wang <tndrwang(at)gmail(dot)com>, Junwang Zhao <zhjwpku(at)gmail(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>, Alexander Lakhin <exclusion(at)gmail(dot)com>, Baehler Thomas SBB CFF FFS <thomas(dot)baehler2(at)sbb(dot)ch>, Guillaume Lelarge <guillaume(at)lelarge(dot)info>
Subject: Re: [BUG] Fix DETACH with FK pointing to a partitioned table fails
Date: 2024-09-02 21:01:47
Message-ID: 20240902230147.0d3958bc@karst
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi,

On Tue, 20 Aug 2024 23:09:27 -0400
Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org> wrote:

> On 2024-Aug-20, Jehan-Guillaume de Rorthais wrote:
>
> > I'm back on this issue as well. I start poking at this patch to review it,
> > test it, challenge it and then report here.
> >
> > I'll try to check if some other issues might have lost/forgot on they way as
> > well.
>
> Thanks, much appreciated, looking forward to your feedback.

Sorry, it took me a while to come back to you on this topic. It has been hard to
untangle subjects, reproductions and patch…

There's three distinct issues/thread:

* Constraint & trigger catalog cleanup [1] (this thread)
* FK broken after DETACH [2]
* Maintenance consideration about self referencing FK between partitions [3]

0. Splitting in two commits

Your patch addresses two bugs:

* one for the constraint & trigger catalog cleanup;
* one for the FK broken after DETACH.

These issues are unrelated, therefore I am wondering if it would be better
to split their resolution in two different patches.

Last year, I reported them in two different threads [1][2]. The first with
implementation consideration, the second with a demo/proposal/draft fix.

Unfortunately, this discussion about the first bug slipped to the second one
when Tender stumbled on this bug as well and reported it. But, both bugs can
be triggered independently, and have distinct fixes.

Finally, splitting the patch might help setting finer patch co-authoring. I
know my patch for [2] was a draft and somewhat trivial, but I spend a fair
amount of time to report, then produce a draft patch, so I was wondering if
it would be candidate to a co-author flag on this (small, humble and
refactored by you) patch?

I'm definitely not involved (yet) in the second part though.

1. Constraint & trigger catalog cleanup [1]

I have been focusing on the current master branch and haven't taken into
consideration backpatching related issues yet.

When I first studied this bug and reported it, I held on writing a patch
because it seemed it would duplicate some existing code. I wrote:

> I poked around DetachPartitionFinalize() to try to find a way to fix this,
> but it looks like it would duplicate a bunch of code from other code path
> (eg. from CloneFkReferenced).

My proposal was to clean everything related to the old FK and use some
existing code path to create a fresh and cleaner one. This requires some
refactoring in existing code, but we would win a common path of code between
create/attach/detach, a cleaner catalog and easier code maintenance.

I've finally been able to write a PoC that implement this by calling
addFkRecurseReferenced() from DetachPartitionFinalize(). I can't join
it here because it is currently an ugly draft and I still have some work
to do. But I would really like to have a little more time (one or two days) to
explore this avenue further before you commit yours, if you don't mind? Or
maybe you already have considered this avenue and rejected it?

2. FK broken after DETACH [2]

Comparing your patch to my draft from [2], I just have a question about the
refactoring.

Fencing the constraint/trigger removal inside a conditional
RELKIND_PARTITIONED_TABLE block of code was obvious. It avoids some useless
catalog scan compared to my draft patch.

Also, the "contype == CONSTRAINT_FOREIGN" I had sounds safe to remove.

However, is it clean/light enough to add the "conparentid == fk->conoid" in
the scan key as I did? I'm not sure it saves anything else but the small
conditional block you inserted inside the loop, but I wonder if there's a
serious concern about this anyway?

Last, considering the tests, I think we should add some rows in the tables,
to make sure the FK is correctly enforced after DETACH. Something like:

CREATE SCHEMA fkpart12
CREATE TABLE fk_p ( id bigint PRIMARY KEY ) PARTITION BY list (id)
CREATE TABLE fk_p_1 PARTITION OF fk_p FOR VALUES IN (1)
CREATE TABLE fk_p_2 PARTITION OF fk_p FOR VALUES IN (2)
CREATE TABLE fk_r_1 ( id bigint PRIMARY KEY, p_id bigint NOT NULL)
CREATE TABLE fk_r_2 ( id bigint PRIMARY KEY, p_id bigint NOT NULL)
CREATE TABLE fk_r ( id bigint PRIMARY KEY, p_id bigint NOT NULL,
FOREIGN KEY (p_id) REFERENCES fk_p (id)
) PARTITION BY list (id);
SET search_path TO fkpart12;

INSERT INTO fk_p VALUES (1);

ALTER TABLE fk_r ATTACH PARTITION fk_r_2 FOR VALUES IN (2);

ALTER TABLE fk_r ATTACH PARTITION fk_r_1 FOR VALUES IN (1);
\d fk_r_1

INSERT INTO fk_r VALUES (1,1);

ALTER TABLE fk_r DETACH PARTITION fk_r_1;
\d fk_r_1

INSERT INTO c_1 VALUES (2,2); -- fails as EXPECTED
DELETE FROM p; -- should fails but was buggy

ALTER TABLE fk_r ATTACH PARTITION fk_r_1 FOR VALUES IN (1);
\d fk_r_1

3. Self referencing FK between partitions [3]

You added to your commit message:

verify: 20230707175859(dot)17c91538(at)karst

I'm not sure what the "verify" flag means. Unfortunately, your patch doesn't
help on this topic.

This bug really needs more discussion and design consideration. I have
thought about this problem and haven't found any solution that don't involve
breaking the current core behavior. It really looks like an impossible bug to
fix without dropping the constraint itself. On both side. Maybe the only sane
behavior would be to forbid detaching the partition if it would break the
constraint.

But let's discuss this on the related thread, should we?

Thank you for reading me all the way down to the bottom!

Regards,

[1] https://www.postgresql.org/message-id/20230705233028.2f554f73%40karst
[2] https://www.postgresql.org/message-id/20230420144344.40744130%40karst
[3] https://www.postgresql.org/message-id/20230707175859.17c91538%40karst

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Andrew Dunstan 2024-09-02 21:45:53 Re: Improving tracking/processing of buildfarm test failures
Previous Message Tom Lane 2024-09-02 19:56:40 Re: thread-safety: strerror_r()