Re: Event triggers + table partitioning cause server crash in current master

From: Mark Dilger <hornschnorter(at)gmail(dot)com>
To: Amit Langote <Langote_Amit_f8(at)lab(dot)ntt(dot)co(dot)jp>
Cc: PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Event triggers + table partitioning cause server crash in current master
Date: 2017-05-15 13:49:42
Message-ID: 422A86B8-9F76-4FF5-8A8A-6F331459493C@gmail.com
Lists: pgsql-hackers


> On May 14, 2017, at 11:02 PM, Amit Langote <Langote_Amit_f8(at)lab(dot)ntt(dot)co(dot)jp> wrote:
>
> On 2017/05/14 12:03, Mark Dilger wrote:
>> Hackers,
>>
>> I discovered a reproducible crash using event triggers in the current
>> development version, 29c7d5e4844443acaa74a0d06dd6c70b320bb315.
>> I was getting a crash before this version, and cloned a fresh copy of
>> the sources to be sure I was up to date, so I don't think the bug can be
>> attributed to Andres' commit. (The prior version I was testing against
>> was heavily modified by me, so I recreated the bug using the latest
>> standard, unmodified sources.)
>>
>> I create both before and after event triggers early in the regression test
>> schedule, which then fire here and there during the following tests, leading
>> fairly reproducibly to the server crashing somewhere during the test suite.
>> These crashes do not happen for me without the event triggers being added
>> to the tests. Many tests show as 'FAILED' simply because the logging
>> that happens in the event triggers creates unexpected output for the test.
>> Those "failures" are expected. The server crashes are not.
>>
>> The server logs suggest the crashes might be related to partitioned tables.
>>
>> Please find attached the patch that includes my changes to the sources
>> for recreating this bug. The logs and regression.diffs are a bit large; let
>> me know if you need them.
>>
>> I built using the command
>>
>> ./configure --enable-cassert --enable-tap-tests && make -j4 && make check
>
> Thanks for the report and providing steps to reproduce.
>
> It seems that it is indeed a bug related to creating range-partitioned
> tables. DefineRelation() calls AlterTableInternal() to add NOT NULL
> constraints on the range partition key columns, but the code fails to
> first initialize the event trigger context information. Attached patch
> should fix that.
>
> Thanks to the above test case, I also discovered that in the case of
> creating a partition, manipulations performed by MergeAttributes() on the
> input schema list may cause it to become invalid, that is, the List
> metadata (length) will no longer match the reality, because while the
> ListCells are deleted from the input list, the List pointer passed to
> list_delete_cell does not point to the same list. This caused a crash
> when the CreateStmt in question was subsequently passed to copyObject,
> which tried to access CreateStmt.tableElts that has become invalid as just
> described. The attached patch also takes care of that.

I can confirm that this fixes the crash that I was seeing. I have read
through the patch briefly, but will give it a more thorough review in the
next few hours.
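
For anyone following the thread, the second problem Amit describes above (list metadata that no longer matches the cells) can be illustrated with a toy model. This is only a sketch under stated assumptions: ToyList, ToyCell and the toy_* functions are hypothetical names invented for this illustration, and they are not PostgreSQL's List, ListCell or list_delete_cell. The sketch shows only the general hazard of deleting a cell through a header other than the one the caller keeps using.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Toy model only: NOT PostgreSQL's List/ListCell implementation. */
typedef struct ToyCell
{
    int             value;
    struct ToyCell *next;
} ToyCell;

typedef struct ToyList
{
    int      length;    /* cached cell count, stored in the header */
    ToyCell *head;
} ToyList;

/* Prepend a value and bump the cached length of the header passed in. */
static void
toy_push(ToyList *list, int value)
{
    ToyCell *cell = malloc(sizeof(ToyCell));

    assert(cell != NULL);
    cell->value = value;
    cell->next = list->head;
    list->head = cell;
    list->length++;
}

/* Unlink and free "cell"; only the header passed in has its length updated. */
static void
toy_delete_cell(ToyList *list, ToyCell *cell, ToyCell *prev)
{
    if (prev != NULL)
        prev->next = cell->next;
    else
        list->head = cell->next;
    list->length--;
    free(cell);
}

/* Count the cells actually reachable from the header. */
static int
toy_count_cells(const ToyList *list)
{
    int n = 0;

    for (const ToyCell *c = list->head; c != NULL; c = c->next)
        n++;
    return n;
}

/* Returns how far the original header's cached length is off after a delete. */
static int
toy_stale_length_demo(void)
{
    ToyList orig = {0, NULL};
    ToyList alias;
    int     stale;

    toy_push(&orig, 1);
    toy_push(&orig, 2);
    toy_push(&orig, 3);

    alias = orig;   /* a second header that shares the same cells */

    /* Delete the middle cell, but through the alias header. */
    toy_delete_cell(&alias, orig.head->next, orig.head);

    stale = orig.length - toy_count_cells(&orig);
    printf("orig.length=%d, cells reachable=%d, alias.length=%d\n",
           orig.length, toy_count_cells(&orig), alias.length);

    /* Free the remaining cells. */
    while (orig.head != NULL)
        toy_delete_cell(&orig, orig.head, NULL);

    return stale;
}
```

Calling toy_stale_length_demo() from a main() shows the original header still claiming one more cell than it can reach, which is the kind of mismatch that later code trusting the length (a copy routine, for instance) could trip over.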

Many thanks for your attention to this!

Mark Dilger
