BUG #18653: Is it necessary for ATExecDropInherit to acquire an AccessExclusiveLock on the parent table?

From: PG Bug reporting form <noreply(at)postgresql(dot)org>
To: pgsql-bugs(at)lists(dot)postgresql(dot)org
Cc: feichanghong(at)qq(dot)com
Subject: BUG #18653: Is it necessary for ATExecDropInherit to acquire an AccessExclusiveLock on the parent table?
Date: 2024-10-13 07:23:44
Message-ID: 18653-f2a96d3b52fd95a0@postgresql.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-bugs

The following bug has been logged on the website:

Bug reference: 18653
Logged by: Fei Changhong
Email address: feichanghong(at)qq(dot)com
PostgreSQL version: 17.0
Operating system: Operating system: centos 8,Kernel version: 5.10.13
Description:

Hi all,

I encountered a puzzling issue when executing "ALTER TABLE NO INHERIT"
concurrently with "SELECT": In an SQL statement, there are two identical
subqueries, but they return different results. This can be reproduced with
the following spec test:

```
setup
{
CREATE TABLE p (a integer);
INSERT INTO p VALUES(1);
CREATE TABLE c1 () INHERITS (p);
INSERT INTO c1 VALUES(10);
CREATE TABLE c2 (a integer);
INSERT INTO c2 VALUES(100);
}

teardown
{
DROP TABLE IF EXISTS c1, c2, p;
}

session s1
step s1b { BEGIN; }
step s1delc1 { ALTER TABLE c1 NO INHERIT p; }
step s1c { COMMIT; }

session s2
step s2sel { explain analyze select a from p group by a union all select a
from p group by a; }

permutation s1b s1delc1 s2sel s1c
```

The result of the SELECT is as follows: one of the UNION ALL nodes returns 2
rows, while the other returns 1 row:

```
QUERY PLAN

--------------------------------------------------------------------------------------------------------------------
Append (cost=60.09..70.28 rows=400 width=4) (actual time=0.022..0.030
rows=3 loops=1)
-> HashAggregate (cost=60.09..62.09 rows=200 width=4) (actual
time=0.022..0.023 rows=2 loops=1)
Group Key: p.a

Batches: 1 Memory Usage: 40kB

-> Append (cost=0.00..53.07 rows=2805 width=4) (actual
time=0.007..0.014 rows=2 loops=1)
-> Seq Scan on p p_1 (cost=0.00..3.55 rows=255 width=4)
(actual time=0.007..0.007 rows=1 loops=1)
-> Seq Scan on c1 p_2 (cost=0.00..35.50 rows=2550 width=4)
(actual time=0.005..0.005 rows=1 loops=1)
-> HashAggregate (cost=4.19..6.19 rows=200 width=4) (actual
time=0.005..0.005 rows=1 loops=1)
Group Key: p_3.a

Batches: 1 Memory Usage: 40kB

-> Seq Scan on p p_3 (cost=0.00..3.55 rows=255 width=4) (actual
time=0.002..0.002 rows=1 loops=1)
Planning Time: 10.051 ms

Execution Time: 0.161 ms
```

Based on my analysis, the issue occurs because "ALTER TABLE NO INHERIT" only
holds an AccessExclusiveLock on the parent table.

When generating the plan for the first subquery, find_all_inheritors uses
the catalog snapshot from before the DDL is committed, so it sees table c1,
but gets blocked when trying to lock c1. For the second subquery,
find_all_inheritors uses the catalog snapshot from after the DDL is
committed, so it doesn't see c1.

Is this behavior expected? Alternatively, should we acquire an
AccessExclusiveLock on the parent table in ATExecDropInherit? This might
lead to a deadlock.

Browse pgsql-bugs by date

  From Date Subject
Next Message Ba Jinsheng 2024-10-13 08:57:45 Re: Question of Parallel Hash Join on TPC-H Benchmark
Previous Message Andrei Lepikhov 2024-10-13 03:09:29 Re: Question of Parallel Hash Join on TPC-H Benchmark