Re: Fundamental scheduling bug in parallel restore of partitioned tables

From: Ashutosh Bapat <ashutosh(dot)bapat(dot)oss(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-hackers(at)lists(dot)postgresql(dot)org, Andres Freund <andres(at)anarazel(dot)de>, Dimitrios Apostolou <jimis(at)gmx(dot)net>
Subject: Re: Fundamental scheduling bug in parallel restore of partitioned tables
Date: 2025-04-16 09:47:54
Message-ID: CAExHW5vRdKhEwsC7AeGmmUmqBoH_u0Rqx9S5b+YM=wXpC0KKOw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Wed, Apr 16, 2025 at 12:19 AM Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>
> Ashutosh Bapat <ashutosh(dot)bapat(dot)oss(at)gmail(dot)com> writes:
> > On Mon, Apr 14, 2025 at 11:14 PM Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> >> This is disastrous for assorted reasons. The ALTER ADD CONSTRAINT
> >> command might fail outright if we've loaded data for the referencing
> >> table but not the referenced table.
>
> > There's a comment in getConstraints()
> > /*
> > * Restoring an FK that points to a partitioned table requires that
> > * all partition indexes have been attached beforehand. Ensure that
> > * happens by making the constraint depend on each index partition
> > * attach object.
> > */
>
> Ah, that is an excellent point which I missed. And the INDEX ATTACH
> objects have dependencies on the leaf tables, which *will* get
> repointed to their TABLE DATA objects by repoint_table_dependencies.
> So by the time we are ready to restore the FK CONSTRAINT object,
> we are certain to have loaded all the data of the referenced table.
> But there's nothing delaying the constraint till after the referencing
> table's data is loaded.

Yes.

>
> So at this point we have:
>
> #1: ADD CONSTRAINT failure because of missing referenced data:
> not possible after all.
>

check

> #2: Deadlock between parallel restore jobs: possible in HEAD, but
> it seems likely to be a bug introduced by the not-null-constraint
> work rather than being pg_restore's fault. We have no evidence
> that such a deadlock can happen in released branches, and the lack
> of field reports suggests that it can't.
>

I agree. I ran that repro several times against v17 and never saw a
deadlock, even after changing the number of partitions, sizes of
partitions, adding more tables etc.

Creating an FK constraint either happens before or after the loading
data in one of the partitions. That might point towards either
constraint creation or data load waiting for the other. But I have not
seen an evidence that a full deadlock chain is possible.

> #3: Restoring the FK constraint before referencing data is loaded:
> this seems to be possible, and it's a performance problem, but
> no more than that.
>

Yes. And once we fix this, there won't be waiting between constraint
creation and data load, so the remote possibility of a deadlock also
vanishes.

> So now I withdraw the suggestion that this patch needs to be
> back-patched. We may not even need it in v18, if another fix
> for #2 is found. Fixing #3 would be a desirable thing to do
> in v19, but if that's the only thing at stake then it's not
> something to break feature freeze for.
>
> For the moment I'll mark this CF entry as meant for v19.
> We can resurrect consideration of it for v18 if there's not
> a better way to fix the deadlock problem.

+1.

--
Best Wishes,
Ashutosh Bapat

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Konstantin Osipov 2025-04-16 09:53:09 Re: Built-in Raft replication
Previous Message Konstantin Osipov 2025-04-16 09:47:00 Re: Built-in Raft replication