Re: Inheritance planner CPU and memory usage change since 9.3.2

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Dean Rasheed <dean(dot)a(dot)rasheed(at)gmail(dot)com>, Thomas Munro <thomas(dot)munro(at)enterprisedb(dot)com>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Inheritance planner CPU and memory usage change since 9.3.2
Date: 2015-06-18 19:26:12
Message-ID: CA+TgmoZEQs2wHbhz-5AXF7A2Pk+SUajy=7LSmVBwBPS8JBTVPQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Thu, Jun 18, 2015 at 3:14 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> Dean Rasheed <dean(dot)a(dot)rasheed(at)gmail(dot)com> writes:
>> On 18 June 2015 at 14:48, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
>>> I feel I might be missing a trick here. It seems unlikely to me that
>>> we actually need the entire append_rel_list for every subquery; and we
>>> almost certainly don't need to modify every element of the
>>> append_rel_list for every subquery. Even the ones that no
>>> ChangeVarNodes() call mutates still get deep-copied.
>
>> Yeah, you could probably pre-compute the indexes of the RTEs that need
>> to copied, outside of the big loop, and store them in a bitmapset.
>> Then, instead of copying the entire list of rowmarks/append_rel_infos
>> each time, you could just copy the ones that referred to those RTE
>> indexes (and only if the bitmapset was non-empty, which is the
>> equivalent of your second optimisation). However, for AppendRelInfos,
>> ChangeVarNodes() descends into the Vars in the translated_vars list,
>> so short-cutting the copying of the AppendRelInfo isn't obviously
>> safe. But, looking more closely, does ChangeVarNodes actually need to
>> examine translated_vars (the fall-through case) when child_relid isn't
>> the old rt_index? If not, that could be a big saving in cases like
>> this.
>
> I'm a bit surprised that duplicating the append_rel_list is a noticeable
> performance problem. It ought to be far smaller than the Query tree that
> we've always duplicated in this loop --- in particular, it's really a
> subset of what we have in the RTE list, no?

Well, append_rel_list has an AppendRelInfo for every RTE and that
contains a List (translated_vars) which in turn contains a Var node
for every column. I'm not sure how that compares to the RTE itself.
I think it's the cost of copying the translated_vars list that is
really the problem here - you can have 200 or 500 columns in a table,
so that's a Var and a ListCell for each one. In the problem cases,
the number of AppendRelInfo elements is a small percentage of the
number of Var nodes under them.

That having been said, I don't know how the size of all that compares
to the size of the Query. But I think the Query must be smaller,
because arranging to discard the AppendRelInfo and its translated_vars
list, and the per-subroot rowMarks list, after every call to
grouping_planner stops the memory blowup.

The whole translated_vars representation seems needless inefficient.
For a subquery, you probably need something like that. But for an
inheritance child, you just need to swap the varno and maybe remap
some varattnos. Indexing into a C array to grab the varattno seems
like it would be a heck of a lot more efficient than calling
list_nth(). It might be worth making AppendRelInfo support a choice
of representations so that we can use something more optimized for the
simple case while not losing the full generality when we need it.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2015-06-18 19:35:43 Re: Inheritance planner CPU and memory usage change since 9.3.2
Previous Message Tom Lane 2015-06-18 19:14:14 Re: Inheritance planner CPU and memory usage change since 9.3.2