Re: Inaccurate (sometimes wildly so) row estimates for simple join

From: Greg Nolle <greg(dot)nolle(at)voidbridge(dot)com>
To: Michael Lewis <mlewis(at)entrata(dot)com>
Cc: PostgreSQL General <pgsql-general(at)lists(dot)postgresql(dot)org>
Subject: Re: Inaccurate (sometimes wildly so) row estimates for simple join
Date: 2020-05-22 14:42:46
Message-ID: CA+wHfn-iS3D5gLQG8fJrJvOZvT7v-dWKJuLPg+qPj=3uv_7RXg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

On 22 May 2020 at 15:30:42, Michael Lewis (mlewis(at)entrata(dot)com) wrote:

On Fri, May 22, 2020 at 7:27 AM Greg Nolle <greg(dot)nolle(at)voidbridge(dot)com>
wrote:

> The crux seems to be that test_b does not have an even distribution for
> a_id values: it only has records for two of the values in the referenced
> table. This is how our real dataset is too and isn’t something we can
> really change.
>

How does it perform with extended statistics created for these two columns?

https://www.postgresql.org/docs/current/sql-createstatistics.html

I forgot to mention that I had already tried that actually, and it sadly
made no difference. I tried both dependencies and ndistinct modes and the
estimates remained the same. Having read through the docs I think that’s to
be expected since dependencies is designed to help when you have multiple
predicates and ndistinct when you’re doing grouping. Neither of which is
the case with my query.

For good measure I just tried mcv mode on PG12 with the test script I
posted and that also made no difference.

In response to

Browse pgsql-general by date

  From Date Subject
Next Message Tom Lane 2020-05-22 14:44:09 Re: btree_gist extension - gbt_cash_union return type
Previous Message Michael Lewis 2020-05-22 14:30:16 Re: Inaccurate (sometimes wildly so) row estimates for simple join