Re: Cardinality estimate of the inner relation

From: Frédéric Yhuel <frederic(dot)yhuel(at)dalibo(dot)com>
To: Andrei Lepikhov <lepihov(at)gmail(dot)com>, pgsql-performance(at)lists(dot)postgresql(dot)org
Cc: Christophe Courtois <christophe(dot)courtois(at)dalibo(dot)com>
Subject: Re: Cardinality estimate of the inner relation
Date: 2024-11-25 12:46:21
Message-ID: fe839a18-6bd5-49c2-a59f-52c726ee2302@dalibo.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

On 11/23/24 03:07, Andrei Lepikhov wrote:
> Thanks for the case provided!
>

Thanks for your answer!

> I wonder if data science has invented a statistic or selectivity
> estimation technique that could tackle your case in general. As I see,
> we should touch the table of products to realise which specific ID has
> the product named 'Babar'.

There are many such IDs in my test case (only one in our initial
customer's case, but I thought the test case should be more generic). It
doesn't really change the reasoning much, but I say it for the sake of
accuracy.

> I can imagine the trick when you build MCV on (id, name) and have a
> chance to find this popular ID, cache it, and use it during join clause
> estimation later. But it seems too expensive to do the same for
> arbitrary incoming queries.
>

I'm not sure that I understand the idea, but I've never tried to hack
into the planner's code yet, so...

> If you want a workaround, such cases fit query-driven techniques. Among
> open-source ones, I can point your attention to the AQO extension
> (honestly, designed under my command). It can save information about an
> estimation error and correct the query next time.

Interesting, thanks!

In response to

Browse pgsql-performance by date

  From Date Subject
Next Message Ba Jinsheng 2024-11-25 20:54:36 Performance of TPC-DS Query 95
Previous Message Andrei Lepikhov 2024-11-24 12:04:17 Re: Performance of Query 60 on TPC-DS Benchmark