From: | Amit Langote <amitlangote09(at)gmail(dot)com> |
---|---|
To: | Justin Pryzby <pryzby(at)telsasoft(dot)com> |
Cc: | PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Robert Haas <robertmhaas(at)gmail(dot)com> |
Subject: | Re: FailedAssertion("pd_idx == pinfo->nparts", File: "execPartition.c", Line: 1689) |
Date: | 2020-08-05 01:12:53 |
Message-ID: | CA+HiwqH=nLqn4e9j-Xwwz06fLeNh7rCBzp7iUG4iWrSMmWzBUQ@mail.gmail.com |
Lists: | pgsql-hackers |
On Wed, Aug 5, 2020 at 10:04 AM Justin Pryzby <pryzby(at)telsasoft(dot)com> wrote:
> On Wed, Aug 05, 2020 at 09:53:44AM +0900, Amit Langote wrote:
> > On Wed, Aug 5, 2020 at 9:52 AM Amit Langote <amitlangote09(at)gmail(dot)com> wrote:
> > > On Wed, Aug 5, 2020 at 9:32 AM Justin Pryzby <pryzby(at)telsasoft(dot)com> wrote:
> > > > On Wed, Aug 05, 2020 at 09:26:20AM +0900, Amit Langote wrote:
> > > > > On Wed, Aug 5, 2020 at 12:11 AM Justin Pryzby <pryzby(at)telsasoft(dot)com> wrote:
> > > > > >
> > > > > > On Tue, Aug 04, 2020 at 08:12:10PM +0900, Amit Langote wrote:
> > > > > > > It may be this commit that went into PG 12 that is causing the problem:
> > > > > >
> > > > > > Thanks for digging into this.
> > > > > >
> > > > > > > to account for partitions that were pruned by the planner for which we
> > > > > > > decided to put 0 into relid_map, but it only considered the case where
> > > > > > > the number of partitions doesn't change since the plan was created.
> > > > > > > The crash reported here is in the other case where the concurrently
> > > > > > > added partitions cause the execution-time PartitionDesc to have more
> > > > > > > partitions than the one that PartitionedRelPruneInfo is based on.
> > > > > >
> > > > > > Is there anything else needed to check that my crash matches your analysis ?
> > > > >
> > > > > If you can spot a 0 in the output of the following, then yes.
> > > > >
> > > > > > (gdb) p *pinfo->relid_map@pinfo->nparts
> > > >
> > > > I guess you knew that an earlier message has just that. Thanks.
> > > > https://www.postgresql.org/message-id/20200803161133.GA21372@telsasoft.com
> > >
> > > Yeah, you showed:
> > >
> > > > (gdb) p *pinfo->relid_map@414
> > >
> > > And there is indeed a 0 in there, but I wasn't sure if it was actually
> > > in the array or a stray zero due to forcing gdb to show beyond the
> > > array bound. Does pinfo->nparts match 414?
>
> Yes. I typed 414 manually since the array lengths were suspect.
>
> (gdb) p pinfo->nparts
> $1 = 414
> (gdb) set print elements 0
> (gdb) p *pinfo->relid_map@pinfo->nparts
> $3 = {....
> 21151836, 21151726, 21151608, 21151498, 21151388, 21151278, 21151168, 21151055, 2576248, 2576255, 2576262, 2576269, 2576276, 21456497, 22064128, 0}
Thanks. There is a 0 in there, which can only be there if the planner was
able to prune that last partition. So, the planner saw a table with
414 partitions, was able to prune the last one and constructed an
Append plan with 413 subplans for unpruned partitions as you showed
upthread:
> (gdb) p *node->appendplans
> $17 = {type = T_List, length = 413, max_length = 509, elements = 0x7037400, initial_elements = 0x7037400}
This suggests that the crash I was able to produce is similar to what you saw.
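
To spell out the arithmetic behind that reading: if a 0 in relid_map marks a
partition the planner pruned, then the number of nonzero entries should equal
the number of Append subplans (414 entries with one 0 gives 413). Below is a
minimal standalone sketch of that check. It is not code from execPartition.c;
the Oid typedef is a stand-in and count_unpruned() is a hypothetical helper
made up for illustration.

```c
#include <assert.h>

typedef unsigned int Oid;	/* stand-in for PostgreSQL's Oid; assumption */

/*
 * count_unpruned (hypothetical helper, not a PostgreSQL function)
 *
 * Count the nonzero entries in relid_map, i.e. the partitions that the
 * planner did not prune and for which a subplan should exist.
 */
static int
count_unpruned(const Oid *relid_map, int nparts)
{
	int			n = 0;

	assert(nparts >= 0);

	for (int i = 0; i < nparts; i++)
	{
		/* a 0 entry means the planner pruned this partition */
		if (relid_map[i] != 0)
			n++;
	}

	return n;
}
```

With the values from the backtrace, count_unpruned(pinfo->relid_map,
pinfo->nparts) would be expected to come out equal to the length of
node->appendplans.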
--
Amit Langote
EnterpriseDB: http://www.enterprisedb.com