Re: Performance regression with PostgreSQL 11 and partitioning

From: Amit Langote <Langote_Amit_f8(at)lab(dot)ntt(dot)co(dot)jp>
To: Ashutosh Bapat <ashutosh(dot)bapat(at)enterprisedb(dot)com>, David Rowley <david(dot)rowley(at)2ndquadrant(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Robert Haas <robertmhaas(at)gmail(dot)com>, Amit Langote <amitlangote09(at)gmail(dot)com>, Thomas Reiss <thomas(dot)reiss(at)dalibo(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Performance regression with PostgreSQL 11 and partitioning
Date: 2018-06-05 08:44:49
Message-ID: 247f7f38-0da4-e52b-5a6c-2e077fc4a6c0@lab.ntt.co.jp
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 2018/06/05 13:44, Ashutosh Bapat wrote:
> On Tue, Jun 5, 2018 at 5:50 AM, David Rowley wrote:
>> On 5 June 2018 at 01:35, Ashutosh Bapat <ashutosh(dot)bapat(at)enterprisedb(dot)com> wrote:
>>> I was wondering if we can get rid of append_rel_list altogether. In
>>> your patch, you have saved AppendRelInfos by child_relid. So all the
>>> slots indexed by parent relid are empty. We could place AppendRelInfos
>>> by parent relid. Thus a given AppendRelInfo would be places at two
>>> places, one at the index of child relid and second at the index
>>> pointed by parent relid. That's fine even in case of multilevel
>>> inheritance since an intermediate parent has two relids one as a
>>> parent and other as a child.
>>>
>>> One problem with that we do not know how long the array would be to
>>> start with. But we could keep on repallocing the array to increase its
>>> size.
>>
>> To be honest, the patch was meant as a discussion topic for ways to
>> improve the reported regression in PG11. The patch was put together
>> fairly quickly.
>
> I think the idea is brilliant. I do not have objections for trying
> something in that direction. I am suggesting that we take this a bit
> forward and try to eliminate append_rel_list altogether.

Imho, we should try to refrain from implementing something that needs
repalloc'ing the array, as long as we're doing it as a fix for PG 11.
Also, because getting rid of append_rel_list altogether seems a bit involved.

Let the AppendRelInfo's be added to append_rel_list when created, as is
done now (expand_inherited_tables), and transfer them to the array when
setting up various arrays (setup_simple_rel_arrays) as the David's patch does.

>> I've not really considered what happens in the case where an inherited
>> table has multiple parents. The patch happens to pass regression
>> tests, but I've not checked to see if the existing tests create any
>> tables this way.
>
> AFAIK, that case doesn't arise while processing a query. Examining the
> way we create AppendRelInfos during expand_inherited_tables(), it's
> clear that we create only one AppendRelInfo for a given child and also
> for a given child and parent pair. If there are two tables from which
> a child inherits, the child's RTE/RelOptInfo is associated only with
> the top-most parent that appears in the query. See following comment
> from find_all_inheritors()
> /*
> * Add to the queue only those children not already seen. This avoids
> * making duplicate entries in case of multiple inheritance paths from
> * the same parent. (It'll also keep us from getting into an infinite
> * loop, though theoretically there can't be any cycles in the
> * inheritance graph anyway.)
> */

That's right.

create table p1 (a int);
create table p2 (b int);
create table c () inherits (p1, p2);

Child table 'c' has two parents but if a query specifies only p1, then
just one AppendRelInfo corresponding to p1-c pair will be created. If p2
was also specified in the query, there will be an AppendRelInfo for p2-c
pair too, but its child_relid will refer to another RT entry that's
created for 'c', that is, the one created when expanding p2.

>> Perhaps it's okay to change things this way for just partitioned
>> tables and leave inheritance hierarchies alone.
>
> There's no point having two separate code paths when internally we are
> treating them as same.

+1

> The only difference between partitioning and
> inheritance is that for multi-level partitioned table, we preserve the
> inheritance hierarchy whereas for multi-level inheritance, we flatten
> it.

Yeah, the proposed patch doesn't change anything about how many
AppendRelInfo's are created, nor about what's contained in them, so it
seems safe to say that it would work unchanged for both regular
inheritance and partitioning.

Thanks,
Amit

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Amit Langote 2018-06-05 08:49:39 Re: Remove mention in docs that foreign keys on partitioned tables are not supported
Previous Message Tomas Vondra 2018-06-05 08:05:35 Re: Spilling hashed SetOps and aggregates to disk