Re: BUG #18377: Assert false in "partdesc->nparts >= pinfo->nparts", fileName="execPartition.c", lineNumber=1943

From: Tender Wang <tndrwang(at)gmail(dot)com>
To: Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>
Cc: 1026592243(at)qq(dot)com, pgsql-bugs(at)lists(dot)postgresql(dot)org
Subject: Re: BUG #18377: Assert false in "partdesc->nparts >= pinfo->nparts", fileName="execPartition.c", lineNumber=1943
Date: 2024-05-14 09:22:08
Message-ID: CAHewXNmJRoeXgFpOopq6Mv8AsTYZuOE2Ap39Us7jKAyYrCEe8g@mail.gmail.com
Lists: pgsql-bugs

Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org> wrote on Tue, May 14, 2024 at 00:55:

> On 2024-Apr-18, Tender Wang wrote:
>
> > Now we switch to gdb2, breakpoint at RelationCacheInvalidateEntry(). We
> > continue gdb2, and we will
> > stop at RelationCacheInvalidateEntry(). And we will see that p relation
> > cache item will be cleared.
> > The backtrace will be attached at the end of this email.
>

It happened in deconstruct_jointree() (called by query_planner()), after the
planner had already fetched the partition information (parts is 2). After
RelationCacheInvalidateEntry(), we fetch the partition information again in
the executor init phase (parts is 1).

> Here is where I think the problem occurs -- I mean, surely
> PlanCacheRelCallback marking the plan as ->is_valid=false should cause
> the prepared query to be replanned, and at that point the replan would
> see that the partition is no more. So by the time we get to this:
>
> > Entering ExecInitAppend(), because part_prune_info is not null, so we will
> > enter CreatePartitionPruneState().
> > We enter find_inheritance_children_extended() again to get partdesc, but in
> > gdb1 we have done DetachPartitionFinalize()
> > and the detach has committed. So we only get one tuple and parts is 1.
>
> we have made a new plan, one whose planner partition descriptor has only
> one partition, so it'd match what ExecInitAppend sees.
>
> Evidently there's something I'm missing in how this plancache
> invalidation works.
>

Hmm, I don't think this issue is closely related to plancache invalidation.

In the scenario I created in [1], there is no cached plan, so a new plan is
created.

After session1 (doing the detach) updated pg_inherits, and before its first
transaction committed, session2 (doing the execute) took its snapshot.
Session1 called WaitForLockersMultiple() before session2 acquired the lock on
the parent relation.

When session2 then fetches the partition information in the planner phase,
the pg_inherits tuple that session1 updated is visible to session2.
find_inheritance_children_extended() contains the following code:

------
xmin = HeapTupleHeaderGetXmin(inheritsTuple->t_data);
snap = GetActiveSnapshot();

if (!XidInMVCCSnapshot(xmin, snap))
------

So the planner will have 2 parts.
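
To make the check above easier to follow, here is a toy model in C. It is
only a sketch under stated assumptions, not PostgreSQL code: ToyXid,
ToySnapshot, toy_xid_in_snapshot() and toy_partition_is_listed() are
hypothetical names, and XID wraparound and subtransactions are ignored. It
relies on XidInMVCCSnapshot() meaning "this XID is still in progress according
to the snapshot", and, as I read the surrounding code, on a detach-pending
partition being skipped only when that check returns false.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef uint32_t ToyXid;            /* hypothetical; no wraparound handling */

/* Toy stand-in for an MVCC snapshot. */
typedef struct ToySnapshot
{
    ToyXid        xmin;   /* XIDs below this had finished at snapshot time */
    ToyXid        xmax;   /* XIDs at or above this had not finished */
    const ToyXid *xip;    /* XIDs in [xmin, xmax) still running */
    size_t        xcnt;   /* number of entries in xip */
} ToySnapshot;

/* True if xid counts as still in progress for this snapshot. */
bool
toy_xid_in_snapshot(ToyXid xid, const ToySnapshot *snap)
{
    if (xid < snap->xmin)
        return false;
    if (xid >= snap->xmax)
        return true;
    for (size_t i = 0; i < snap->xcnt; i++)
    {
        if (snap->xip[i] == xid)
            return true;
    }
    return false;
}

/*
 * A partition marked detach-pending stays in the list as long as the
 * transaction that marked it (tuple_xmin) is still in progress for the
 * reader's snapshot.
 */
bool
toy_partition_is_listed(bool detach_pending, ToyXid tuple_xmin,
                        const ToySnapshot *snap)
{
    if (detach_pending && !toy_xid_in_snapshot(tuple_xmin, snap))
        return false;
    return true;
}

/* Snapshot taken before XID 100 commits vs. one taken afterwards. */
void
toy_visibility_demo(void)
{
    const ToyXid running[] = { 100 };
    ToySnapshot  early = { 100, 101, running, 1 };
    ToySnapshot  late = { 101, 101, NULL, 0 };

    assert(toy_partition_is_listed(true, 100, &early));   /* still listed */
    assert(!toy_partition_is_listed(true, 100, &late));   /* omitted */
}
```

With a snapshot like "early", which is the situation of session2 here, the
detach-pending partition is still counted, which is why the planner ends up
with 2 parts.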

Session1 then continues the detach work. Because it has already passed
WaitForLockersMultiple(), it calls RemoveInheritance() to delete the tuple
and queues an invalidation message for the parent relation.
Session2 processes that message in RelationCacheInvalidateEntry() during
deconstruct_jointree(), so its cached partition information is cleared.

When ExecInitAppend() is called, session2 fetches the partition information
again. Session1 has finished its work by then, so session2 gets only 1
pg_inherits tuple.

Finally, we hit the assert.
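
The ordering can be condensed into a small C sketch. This is a toy timeline
under stated assumptions, not PostgreSQL code: ToyCatalog, ToyPruneInfo,
ToyPartDesc and all toy_* functions are hypothetical, and the only thing
taken from the real code is the asserted condition
partdesc->nparts >= pinfo->nparts.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins: rows in pg_inherits for the parent table. */
typedef struct ToyCatalog   { int inherits_rows; } ToyCatalog;
/* Partition count the planner recorded in its pruning info. */
typedef struct ToyPruneInfo { int nparts; } ToyPruneInfo;
/* Partition count the executor sees when it rebuilds the descriptor. */
typedef struct ToyPartDesc  { int nparts; } ToyPartDesc;

/* Planner phase: record how many partitions are currently listed. */
ToyPruneInfo
toy_plan(const ToyCatalog *cat)
{
    ToyPruneInfo pinfo = { cat->inherits_rows };
    return pinfo;
}

/* Second half of the detach: the pg_inherits row goes away. */
void
toy_detach_finalize(ToyCatalog *cat)
{
    cat->inherits_rows--;
}

/* Executor init: the cached descriptor was invalidated, so rebuild it. */
ToyPartDesc
toy_build_partdesc(const ToyCatalog *cat)
{
    ToyPartDesc desc = { cat->inherits_rows };
    return desc;
}

/*
 * Returns whether "partdesc->nparts >= pinfo->nparts" holds for one run.
 * If the detach finishes between planning and executor init, it does not.
 */
bool
toy_assert_condition(bool detach_between_plan_and_exec)
{
    ToyCatalog   cat = { 2 };
    ToyPruneInfo pinfo;
    ToyPartDesc  desc;

    if (!detach_between_plan_and_exec)
        toy_detach_finalize(&cat);      /* detach done before planning */

    pinfo = toy_plan(&cat);

    if (detach_between_plan_and_exec)
        toy_detach_finalize(&cat);      /* the race described above */

    desc = toy_build_partdesc(&cat);
    return desc.nparts >= pinfo.nparts;
}

/* Both orderings side by side. */
void
toy_timeline_demo(void)
{
    assert(toy_assert_condition(false));    /* 1 >= 1 holds */
    assert(!toy_assert_condition(true));    /* 1 >= 2 fails */
}
```

The sketch leaves out locking and snapshots entirely; it only shows that the
planner count and the executor count come from two separate reads with the
end of the detach in between.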

[1]
https://www.postgresql.org/message-id/CAHewXNkaKgVmT%2BOkVA9UHrEYm%2Bb8J6o_8%2B-84Qey6V5tM-%2Bz9A%40mail.gmail.com

Some key points:
1. The updated pg_inherits tuple must be visible to the SELECT.
2. The session detaching the partition must be able to get past
WaitForLockersMultiple().

We use two transactions to detach a partition concurrently.
I have a question: can two transactions guarantee consistency?
--
Tender Wang
OpenPie: https://en.openpie.com/
