Re: BUG #18559: Crash after detaching a partition concurrently from another session

From: Kuntal Ghosh <kuntalghosh(dot)2007(at)gmail(dot)com>
To: Alvaro Herrera from 2ndQuadrant <alvherre(at)alvh(dot)no-ip(dot)org>
Cc: Junwang Zhao <zhjwpku(at)gmail(dot)com>, Tender Wang <tndrwang(at)gmail(dot)com>, pgsql-bugs(at)lists(dot)postgresql(dot)org
Subject: Re: BUG #18559: Crash after detaching a partition concurrently from another session
Date: 2024-08-13 12:12:29
Message-ID: CAGz5QCKGAhdN_Aq109zRAv44nJfZw2Wc6Fx22EZZ-UmX-WGFVg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-bugs

On Tue, Aug 13, 2024 at 4:03 AM Alvaro Herrera from 2ndQuadrant
<alvherre(at)alvh(dot)no-ip(dot)org> wrote:
>
> On 2024-Aug-12, Alvaro Herrera from 2ndQuadrant wrote:
>
> > On 2024-Aug-10, Junwang Zhao wrote:
> >
> > > > IIUC, a simple fix would be to retry if an entry is not found. Attached a patch.
> > >
> > > I can reproduce the issue, and the patch LGTM.
> >
> > Interesting issue, thanks for reporting and putting together a
> > reproducer. I have added some comments to the proposed patch, so here's
> > a v2 for it. I'm going to write a commit message for it and push to all
> > branches since 14.
>
> Dept. of second thoughts. I couldn't find any reason why it's okay to
> dereference the return value from systable_getnext() without verifying
> that it's not NULL, so I added the check to the older branches too. As
> far as I know we've never had a crash report that could be traced to
> lack of that check, but that code still looks like it's assuming a
> little too much.
+1.

>
> One more thing here. Applying the test scripts that I used for the
> previous bug with addition of DROP after the DETACH CONCURRENTLY, I get
> a different failure during planning, which reports this error:
>
> ERROR: could not open relation with OID 457639
> STATEMENT: select * from p where a = $1;
>
> #0 __pthread_kill_implementation (threadid=<optimized out>, signo=signo(at)entry=6, no_tid=no_tid(at)entry=0) at ./nptl/pthread_kill.c:44
> #1 0x00007f149c3cae8f in __pthread_kill_internal (signo=6, threadid=<optimized out>) at ./nptl/pthread_kill.c:78
> #2 0x00007f149c37bfb2 in __GI_raise (sig=sig(at)entry=6) at ../sysdeps/posix/raise.c:26
> #3 0x00007f149c366472 in __GI_abort () at ./stdlib/abort.c:79
> #4 0x0000557db57b8611 in errfinish (filename=<optimized out>, lineno=61, funcname=0x557db5816128 <__func__.1> "relation_open")
> at ../../../../../../pgsql/source/master/src/backend/utils/error/elog.c:599
> #5 0x0000557db52441ad in relation_open (relationId=17466, lockmode=lockmode(at)entry=1)
> at ../../../../../../pgsql/source/master/src/backend/access/common/relation.c:61
> #6 0x0000557db537d1d9 in table_open (relationId=<optimized out>, lockmode=lockmode(at)entry=1)
> at ../../../../../../pgsql/source/master/src/backend/access/table/table.c:44
> #7 0x0000557db55b0cf8 in expand_partitioned_rtentry (root=root(at)entry=0x557db9c8b530, relinfo=relinfo(at)entry=0x557db9c8c2d8,
> parentrte=parentrte(at)entry=0x557db9c8c178, parentRTindex=parentRTindex(at)entry=1, parentrel=parentrel(at)entry=0x7f149bc46060, parent_updatedCols=0x0,
> top_parentrc=0x0, lockmode=1) at ../../../../../../pgsql/source/master/src/backend/optimizer/util/inherit.c:390
That means - after getting the live partitions from
prune_append_rel_partitions(), by the time the code tries to lock a
child, it's already dropped.

Able to reproduce the issue with same steps with a small tweak,

Session 1:
1. Continue till DetachPartitionFinalize.

Session 2:
1. Continue till expand_partitioned_rtentry(). It'll find two live
partition after calling prune_append_rel_partitions().

Session 1:
1. Run to completion.
2. SQL: drop table p2;

Session 2:
1. Continue

The table_open will thrown the error - "could not open relation with OID".

The function find_inheritance_children_extended deals with the missing
partition as following:
/* Get the lock to synchronize against concurrent drop */
LockRelationOid(inhrelid, lockmode);

/*
* Now that we have the lock, double-check to see if the relation
* really exists or not. If not, assume it was dropped while we
* waited to acquire lock, and ignore it.
*/
if (!SearchSysCacheExists1(RELOID, ObjectIdGetDatum(inhrelid)))
{
/* Release useless lock */
UnlockRelationOid(inhrelid, lockmode);
/* And ignore this relation */
continue;
}
However, similar check is not there in expand_partitioned_rtentry().
Introducing the same check will fix the issue. But, I don't know how
it affects the pruning part as this partition couldn't be pruned
earlier and that's why we're opening the child partition.

The key points I see are:
1. The find_inheritance_children_extended() function includes a
partition that's being detached based on the current snapshot.
2. Later in the code path, the expand_partitioned_rtentry function
takes a heavy-weight lock on the partitions.

Between these two steps, some code paths expect the child partition to
exist in the syscache, which leads to errors or crashes.

--
Thanks & Regards,
Kuntal Ghosh

In response to

Responses

Browse pgsql-bugs by date

  From Date Subject
Next Message PG Bug reporting form 2024-08-13 12:29:57 BUG #18582: fixed range of search for empty slot in SLRU
Previous Message Tomas Vondra 2024-08-13 12:12:27 Re: FDW INSERT batching can change behavior