Re: Planning performance problem (67626.278ms)

From: Manuel Weitzman <manuelweitzman(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Ranier Vilela <ranier(dot)vf(at)gmail(dot)com>, David Rowley <dgrowleyml(at)gmail(dot)com>, Jeremy Schneider <schnjere(at)amazon(dot)com>, "pgsql-performance(at)lists(dot)postgresql(dot)org" <pgsql-performance(at)lists(dot)postgresql(dot)org>, Andres Freund <andres(at)anarazel(dot)de>
Subject: Re: Planning performance problem (67626.278ms)
Date: 2021-06-29 21:35:38
Message-ID: F9C9FBE0-C535-49DA-B913-F11AC30F8F4E@gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance


> On 29-06-2021, at 15:43, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>
> Manuel Weitzman <manuelweitzman(at)gmail(dot)com> writes:
>>> On 20-06-2021, at 17:06, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>>> So ... the reason why there's not caching of get_actual_variable_range
>>> results already is that I'd supposed it wouldn't be necessary given
>>> the caching of selectivity estimates that happens at the RestrictInfo
>>> level. I don't have any objection in principle to adding another
>>> caching layer if that one's not working well enough, but I think it'd
>>> be wise to first understand why it's needed.
>
>> For what I could make out from the code, the caching done at the
>> RestrictInfo level is already saving a lot of work, but there's a
>> different RestrictInfo instance for each alternative path created by
>> make_one_rel().
>
> That seems a bit broken; a given WHERE clause should produce only one
> RestrictInfo. Can you provide a more concrete example?
>

I added some logging to see hits and misses on cached_scansel() for
this query
> explain (analyze, buffers)
> select * from a
> join b b1 on (b1.a = a.a)
> join b b2 on (b2.a = a.a)
> where b1.a in (1,100,10000,1000000,1000001);

Apparently there's a RestrictInfo for each possible way of doing merge
join (are those created dynamically for planning?), for example:
- a join (b1 join b2)
- b1 join (a join b2)
- b2 join (a join b1)

When the cost of a possible mergejoin path hasn't been computed yet,
then mergejoinscansel() would have to check the bloated index again.

I attached a patch so you can see the hits and misses on cached_scansel().
Each time there's a miss logged, there's also a different RestrictInfo
pointer involved.

Best regards,
Manuel

Attachment Content-Type Size
cached_scansel_hitmiss.patch application/octet-stream 1.3 KB

In response to

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message Tom Lane 2021-06-29 22:31:56 Re: Planning performance problem (67626.278ms)
Previous Message Tom Lane 2021-06-29 19:43:28 Re: Planning performance problem (67626.278ms)