Re: BUG #18806: When enable_rartitionwise_join is set to ON, the database shuts down abnormally

From: Ashutosh Bapat <ashutosh(dot)bapat(dot)oss(at)gmail(dot)com>
To: Tender Wang <tndrwang(at)gmail(dot)com>
Cc: pgsql-bugs(at)lists(dot)postgresql(dot)org, Amit Langote <amitlangote09(at)gmail(dot)com>, m_lingbin(at)126(dot)com, Laurenz Albe <laurenz(dot)albe(at)cybertec(dot)at>
Subject: Re: BUG #18806: When enable_rartitionwise_join is set to ON, the database shuts down abnormally
Date: 2025-02-12 12:20:41
Message-ID: CAExHW5unrwcyrV7K77OR36EqWDp33furQ_-MWeeev=G4Qwa6OA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-bugs

On Wed, Feb 12, 2025 at 5:41 PM Tender Wang <tndrwang(at)gmail(dot)com> wrote:
>
>
>
> Tender Wang <tndrwang(at)gmail(dot)com> 于2025年2月12日周三 19:48写道:
>>
>>
>>
>> Tender Wang <tndrwang(at)gmail(dot)com> 于2025年2月12日周三 17:48写道:
>>>
>>>
>>>
>>> Laurenz Albe <laurenz(dot)albe(at)cybertec(dot)at> 于2025年2月12日周三 15:38写道:
>>>>
>>>> On Wed, 2025-02-12 at 07:01 +0000, PG Bug reporting form wrote:
>>>> > PostgreSQL version: 17.2
>>>> > Operating system: CentOS Linux release 7.9.2009 (Core)
>>>> > Description:
>>>> >
>>>> > After upgrading PG17.2, testing found that when enable_rartitionwise_join is
>>>> > set to ON, when executing a query, if the main query and subquery have the
>>>> > same table, the query will report an error.
>>>> >
>>>> > The database log shows the following error:
>>>> > LOG: server process (PID 24796) was terminated by signal 6: Aborted
>>>> > DETAIL: Failed process was running: select * from test t join orders o on
>>>> > t.order_id =o.order_id where t.order_id in (select order_id from orders);
>>>> > LOG: terminating any other active server processes
>>>>
>>>> That is a crash that may well indicate a PostgreSQL bug.
>>>>
>>>> However, without a way to reproduce the behavior, we won't be able to fix
>>>> the problem. Try to come up with a self-contained test case. It would
>>>> also be interesting to know which PostgreSQL extensions are present.
>>>>
>>>
>>> I can reproduce this crash on HEAD.
>>>
>>> postgres=# create table test(order_id int, name varchar);
>>> create table orders(order_id int, name char(10)) partition by hash (order_id);
>>> CREATE TABLE
>>> CREATE TABLE
>>> postgres=# set enable_partitionwise_join = on;
>>> SET
>>> postgres=# create table orders_p1 partition of orders for values with ( modulus 32, remainder 0);
>>> CREATE TABLE
>>> postgres=# explain select * from test t join orders o on
>>> t.order_id =o.order_id where t.order_id in (select order_id from orders);
>>> server closed the connection unexpectedly
>>> This probably means the server terminated abnormally
>>> before or while processing the request.
>>> The connection to the server was lost. Attempting reset: Succeeded.
>>>
>>
>> I debug this crash, I found this crash was related with below commit:
>> commit 5278d0a2e870c61f9374a7796b90e6f9f6a73638
>> Author: Amit Langote <amitlan(at)postgresql(dot)org>
>> Date: Mon Mar 25 12:02:40 2024 +0900
>>
>> Reduce memory used by partitionwise joins
>>
>>
>> the joinlist in this query is 1,2,5
>> 1 means table t
>> 2 means table orders
>> 5 also means table orders, which subquery is pulled up.
>> When we create join rel for (2,5), then wen enter try_partitionwise_join().
>> In this func, we call build_child_join_sjinfo() to build SpecialJoinInfo for child.
>> In build_child_join_sjinfo(), we use memcpy(), I think we should use copyObject().
>> Otherwise, the parent_sjinfo will be freed, for example, after call free_child_join_sjinfo(),
>> the parent_sjinfo min_lefthand looks like as below:
>>
>> (gdb) p *parent_sjinfo->min_lefthand
>> $7 = {type = 3951489872, nwords = 21994, words = 0x55eaeb8696b0}
>>
>> The above bitmap is invalid, so trigger the Assert(bms_is_valid_set(a));
>>
>> Attached patch is my fix. Any thoughts?

Thanks for the report and the fix. Ideally, following line should
allocate separate memory for the child's bitmapset.
sjinfo->min_lefthand = adjust_child_relids(sjinfo->min_lefthand,
left_nappinfos, left_appinfos);

Unless no relid in sjinfo->min_lefthand was translated according to
the given appinfos, adjust_child_relids() should create a new
Bitmapset. Can you please check why no translation happened. I will
try your reproduction tomorrow.

--
Best Wishes,
Ashutosh Bapat

In response to

Responses

Browse pgsql-bugs by date

  From Date Subject
Next Message Tender Wang 2025-02-12 12:33:30 Re: BUG #18806: When enable_rartitionwise_join is set to ON, the database shuts down abnormally
Previous Message Tender Wang 2025-02-12 12:11:28 Re: BUG #18806: When enable_rartitionwise_join is set to ON, the database shuts down abnormally