Re: [bug?] Missed parallel safety checks, and wrong parallel safety

From: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To: "houzj(dot)fnst(at)fujitsu(dot)com" <houzj(dot)fnst(at)fujitsu(dot)com>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Greg Nancarrow <gregn4422(at)gmail(dot)com>, "tsunakawa(dot)takay(at)fujitsu(dot)com" <tsunakawa(dot)takay(at)fujitsu(dot)com>, Bharath Rupireddy <bharath(dot)rupireddyforpostgres(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: [bug?] Missed parallel safety checks, and wrong parallel safety
Date: 2021-06-24 03:10:25
Message-ID: CAA4eK1JSbh-hXePx2A0FwFCSDPabBVKkv4z9NJRkMDNJYYLV2A@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Wed, Jun 16, 2021 at 6:10 PM houzj(dot)fnst(at)fujitsu(dot)com
<houzj(dot)fnst(at)fujitsu(dot)com> wrote:
>
> On Tuesday, June 15, 2021 10:01 PM Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
> >
> > Now, maybe it could be done, and I think that's worth a little more thought. For
> > example, perhaps whenever we invalidate a relation, we could also somehow
> > send some new, special kind of invalidation for its parent saying, essentially,
> > "hey, one of your children has changed in a way you might care about." But
> > that's not good enough, because it only goes up one level. The grandparent
> > would still be unaware that a change it potentially cares about has occurred
> > someplace down in the partitioning hierarchy. That seems hard to patch up,
> > again because of the locking rules. The child can know the OID of its parent
> > without locking the parent, but it can't know the OID of its grandparent without
> > locking its parent. Walking up the whole partitioning hierarchy might be an
> > issue for a number of reasons, including possible deadlocks, and possible race
> > conditions where we don't emit all of the right invalidations in the face of
> > concurrent changes. So I don't quite see a way around this part of the problem,
> > but I may well be missing something. In fact I hope I am missing something,
> > because solving this problem would be really nice.
>
> For partition, I think postgres already have the logic about recursively finding
> the parent table[1]. Can we copy that logic to send serval invalid messages that
> invalidate the parent and grandparent... relcache if change a partition's parallel safety ?
> Although, it means we need more lock(on its parents) when the parallel safety
> changed, but it seems it's not a frequent scenario and looks acceptable.
>
> [1] In generate_partition_qual()
> parentrelid = get_partition_parent(RelationGetRelid(rel), true);
> parent = relation_open(parentrelid, AccessShareLock);
> ...
> /* Add the parent's quals to the list (if any) */
> if (parent->rd_rel->relispartition)
> result = list_concat(generate_partition_qual(parent), my_qual);
>

As shown by me in another email [1], such a coding pattern can lead to
deadlock. It is because in some DDL operations we walk the partition
hierarchy from top to down and if we walk from bottom to upwards, then
that can lead to deadlock. I think this is a dangerous coding pattern
and we shouldn't try to replicate it.

[1] - https://www.postgresql.org/message-id/CAA4eK1LsFpjK5gL%2B0HEvoqB2DJVOi19vGAWbZBEx8fACOi5%2B_A%40mail.gmail.com

--
With Regards,
Amit Kapila.

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Amit Kapila 2021-06-24 03:15:48 Re: Decoding speculative insert with toast leaks memory
Previous Message Michael Paquier 2021-06-24 02:21:48 Re: pgbench logging broken by time logic changes