
Re: Inaccurate (sometimes wildly so) row estimates for simple join

From:	Greg Nolle <greg(dot)nolle(at)voidbridge(dot)com>
To:	Michael Lewis <mlewis(at)entrata(dot)com>
Cc:	PostgreSQL General <pgsql-general(at)lists(dot)postgresql(dot)org>
Subject:	Re: Inaccurate (sometimes wildly so) row estimates for simple join
Date:	2020-05-22 14:42:46
Message-ID:	CA+wHfn-iS3D5gLQG8fJrJvOZvT7v-dWKJuLPg+qPj=3uv_7RXg@mail.gmail.com
Lists:	pgsql-general

On 22 May 2020 at 15:30:42, Michael Lewis (mlewis(at)entrata(dot)com) wrote:

On Fri, May 22, 2020 at 7:27 AM Greg Nolle <greg(dot)nolle(at)voidbridge(dot)com>
wrote:

> The crux seems to be that test_b does not have an even distribution for
> a_id values: it only has records for two of the values in the referenced
> table. This is how our real dataset is too and isn’t something we can
> really change.
>

How does it perform with extended statistics created for these two columns?

https://www.postgresql.org/docs/current/sql-createstatistics.html

I forgot to mention that I had already tried that, and sadly it made no
difference. I tried both dependencies and ndistinct modes, and the
estimates remained the same. Having read through the docs, I think that’s to
be expected: dependencies is designed to help when you have multiple
predicates, and ndistinct when you’re doing grouping, and neither is the
case with my query.

For good measure I just tried mcv mode on PG12 with the test script I
posted and that also made no difference.
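
For anyone following along, the attempts described above would look roughly
like the sketch below. Only test_b and a_id are names taken from this
thread; the second column (id) and the statistics object names are
placeholders I am assuming for illustration, so substitute whatever the
test script actually uses.

```sql
-- Hypothetical sketch: test_b and a_id come from the thread; the second
-- column (id) and the statistics names are assumptions for illustration.
CREATE STATISTICS test_b_dep (dependencies) ON a_id, id FROM test_b;
CREATE STATISTICS test_b_nd  (ndistinct)    ON a_id, id FROM test_b;
-- The mcv kind requires PostgreSQL 12 or later.
CREATE STATISTICS test_b_mcv (mcv)          ON a_id, id FROM test_b;

-- Extended statistics are only populated when the table is analyzed.
ANALYZE test_b;

-- Confirm which statistics objects exist and which kinds they cover.
SELECT stxname, stxkind FROM pg_statistic_ext WHERE stxname LIKE 'test_b_%';
```

Each of these objects describes columns within a single table, which, as
far as I can tell from the docs, fits with them having no effect on the
row estimate for a join between two tables.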

In response to

Re: Inaccurate (sometimes wildly so) row estimates for simple join at 2020-05-22 14:30:16 from Michael Lewis
