Re: BUG #18377: Assert false in "partdesc->nparts >= pinfo->nparts", fileName="execPartition.c", lineNumber=1943

From: Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>
To: Tender Wang <tndrwang(at)gmail(dot)com>
Cc: 1026592243(at)qq(dot)com, pgsql-bugs(at)lists(dot)postgresql(dot)org
Subject: Re: BUG #18377: Assert false in "partdesc->nparts >= pinfo->nparts", fileName="execPartition.c", lineNumber=1943
Date: 2024-06-12 19:53:00
Message-ID: 202406121953.gfdukghim5d2@alvherre.pgsql
Lists: pgsql-bugs

On 2024-Jun-11, Alvaro Herrera wrote:

> ... and actually, the code that maps partitions when these arrays don't
> match is all wrong, because it assumes that they are in OID order, which
> is not true, so I'm busy rewriting it. More soon.

So I came up with the algorithm in the attached patch. As far as I can
tell, it works okay; I've been trying to produce a test that would
stress it some more, but I noticed a pgbench shortcoming that I've been
trying to solve, unsuccessfully.

Anyway, the idea here is that we match the partdesc->oids entries to the
pinfo->relid_map entries with a distance allowance -- that is, we search
for some OIDs a few elements ahead of the current position. This allows
us to skip some elements that do not match, without losing sync of the
correct position in the array. This works because 1) the arrays must be
in the same order, that is, bound order; and 2) the number of elements
that might be missing is bounded by the difference in array lengths.
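
To make the idea concrete, here is a minimal standalone sketch of that kind of matching. It is not the code from the attached patch: the function name match_with_lookahead, its parameters, and the use of plain unsigned arrays in place of partdesc->oids and pinfo->relid_map are all assumptions made for illustration. For each entry of the current array it scans at most "allowance" elements ahead in the planned array, and resumes just past the match when it finds one.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical illustration only, not PostgreSQL code.
 *
 * cur[]     stands in for partdesc->oids (what the executor sees now).
 * planned[] stands in for pinfo->relid_map (what the planner saw).
 * Both are assumed to be in the same relative (bound) order.
 *
 * On return, map[i] is the index in planned[] that matches cur[i],
 * or -1 if no match was found within the allowed distance.
 */
static void
match_with_lookahead(const unsigned *cur, int ncur,
                     const unsigned *planned, int nplanned,
                     int allowance, int *map)
{
    int j = 0;                  /* next unconsumed position in planned[] */

    assert(cur != NULL && planned != NULL && map != NULL);
    assert(allowance >= 0);

    for (int i = 0; i < ncur; i++)
    {
        int found = -1;

        /* Peek at most "allowance" elements past the current position. */
        for (int k = j; k < nplanned && k <= j + allowance; k++)
        {
            if (planned[k] == cur[i])
            {
                found = k;
                break;
            }
        }

        map[i] = found;

        /*
         * On a match, skip everything up to and including it; the skipped
         * entries are never retried because the arrays share one order.
         * Without a match, j stays put so later entries can still sync.
         */
        if (found >= 0)
            j = found + 1;
    }
}
```

With planned = {30, 10, 40, 20}, cur = {30, 40, 20} and an allowance of 1 (the difference in lengths), the entry 10 is skipped and the resulting map is {0, 2, 3}. Note the values are deliberately not in ascending order, since nothing here relies on OID order.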

As I said, I've been hammering it with some modified pgbench scripts;
mainly I did this to set up:

drop table if exists p;
do $$ declare i int; begin for i in 0..99 loop execute format('drop table if exists p%s', i); end loop; end $$;
drop sequence if exists detaches;
create table p (a int, b int) partition by list (a);
set client_min_messages=warning;
do $$
declare
  i int;
  modulus int;
begin
  for modulus in 0 .. 4 loop
    for i in 0..99 loop
      if i % 5 <> modulus then
        continue;
      end if;
      execute format('create table p%s partition of p for values in (%s)', i, i);
    end loop;
  end loop;
end $$;
reset client_min_messages;
create sequence detaches;

which ensures the partitions are not in OID order, and then used this
pgbench script

\set part random(0, 89)

select pg_try_advisory_lock(:part)::integer AS gotlock \gset
\if :gotlock

select pg_advisory_lock(142857);
alter table p detach partition p:part concurrently;
select pg_advisory_unlock(142857);

\set slp random(100, 200)
\sleep :slp us

alter table p attach partition p:part for values in (:part);

select pg_advisory_unlock(:part), nextval('detaches');
\endif

which detaches and reattaches randomly chosen partitions, running
together with this other script

\set id random(0,99)
select * from p where a = :id;

which reads from the partitioned table.

This setup would fail really quickly with the original code, and with
the patched code it can run a total of some 6700 detach/attach cycles in
60 seconds. This seems quite slow, and in fact looking at the total
number of partitions in pg_inherits, we have either 99 or 100 partitions
almost the whole time. That's why I'm trying to modify pgbench ...
I think the problem is that pg_advisory_lock() holds a snapshot which
causes a concurrent detach partition to wait for it, or something like
that. I added a \lock command, but it doesn't seem to work the way I
want it to.

--
Álvaro Herrera Breisgau, Deutschland — https://www.EnterpriseDB.com/
"I'm always right, but sometimes I'm more right than other times."
(Linus Torvalds)
https://lore.kernel.org/git/Pine(dot)LNX(dot)4(dot)58(dot)0504150753440(dot)7211(at)ppc970(dot)osdl(dot)org/

Attachment Content-Type Size
v3-0001-Fix-partition-pruning-setup-during-DETACH-CONCURR.patch text/x-diff 8.1 KB